Abstract

A nonparametric kernel density estimator for directional-linear data is introduced. The 
proposal is based on a product kernel accounting for the different nature of both (directional 
and linear) components of the random vector. Expressions for bias, variance and mean integrated 
square error (MISE) are derived, jointly with an asymptotic normality result for the proposed 
estimator. For some particular distributions, an explicit formula for the MISE is obtained and 
compared with its asymptotic version, both for directional and directional-linear kernel density 
estimators. In this same setting a closed expression for the bootstrap MISE is also derived. 

Keywords: Directional-linear data; Kernel density estimator; Nonparametric statistics. 

1 Introduction 

Kernel density estimation, and kernel smoothing methods in general, is a classical topic in
nonparametric statistics. Starting from the first papers by Rosenblatt (1956) and Parzen (1962),
extensions of the kernel density methodology have been brought up in different contexts, dealing with
other smoothers, more complex data (censorship, truncation, dependence) or dynamical models
(see Müller (2006) for a review). Some comprehensive references on this topic include the books by
Silverman (1986), Scott (1992) and Wand and Jones (1995), among others.

Beyond the linear case, kernel density estimation has also been adapted to directional data, that
is, data on the $q$-dimensional sphere. Hall et al. (1987) define two types of kernel estimators and
give asymptotic formulae for bias, variance and square loss. Almost simultaneously, Bai et al. (1988)
established the pointwise and uniform strong consistency, and the $L_1$ consistency, of a quite similar
estimator in the same context. Later, Zhao and Wu (2001) stated a central limit theorem for the
integrated squared error of the previous kernel density estimator, based on the $U$-statistic martingale
ideas developed by Hall (1984). Some of the results by Hall et al. (1987) were extended by Klemelä
(2000), who studied the estimation of the Laplacian of the density and other types of derivatives.
All these references consider data lying on a general $q$-sphere of arbitrary dimension $q$, which
comprises as particular cases circular data ($q = 1$) and spherical data ($q = 2$). For the particular
case of circular data, there exist more recent works dealing with the problem of smoothing
parameter selection in kernel density estimation, such as Taylor (2008) and Oliveira et al. (2012).
Di Marzio et al. (2011) study the kernel density estimator on the $q$-dimensional torus, and propose
some bandwidth selection methods. Recently, a more general approach has been followed by
Pelletier (2005) and Henry and Rodriguez (2009), who present a wider but more complex setting
considering data in generic Riemannian manifolds. Nevertheless, the original approach seems to
present a good balance between generality and complexity.

The aim of this work is to introduce and derive some basic properties of a joint kernel density
estimator for directional-linear data, i.e. data with a directional and a linear component. This type
of data arises in a variety of applied fields such as meteorology (wind direction and wind speed),
oceanography (in the study of sea currents) and environmental sciences, among others. As an
example, such an estimator has been used by García-Portugués et al. (2012) for studying the relation
between pollutants and wind direction in the presence of an emission source. Specifically, the
asymptotic properties of the directional-linear kernel density estimator derived in this work include
the bias, variance and asymptotic normality. As a by-product, the Mean Squared Error (MSE)
and Mean Integrated Squared Error (MISE) follow, as well as the expression for optimal AMISE
bandwidths. In addition, for a particular class of densities consisting of mixtures of directional von
Mises and normals, it is possible to compare the AMISE with the exact MISE. These results have
also been obtained for the purely directional case, considering mixtures of von Mises distributions
in the $q$-dimensional sphere.

This paper is organized as follows. Section 2 presents some background on kernel density estimation
for linear data, directional data (Subsection 2.1) and the proposed estimator in the directional-linear
context (Subsection 2.2). The main results of this paper are included in Section 3, where the bias,
variance and asymptotic normality for the proposed kernel estimator are derived. Section 4 is
focused on the issue of error measurement: expressions for the AMISE of the estimator and the
exact MISE for particular cases of mixtures are obtained, both in the directional and
directional-linear contexts. Conclusions and final comments are given in Section 5. The proofs of
the results and some technical lemmas are given in the Appendix.

2 Background 

This section is devoted to a brief introduction to kernel density estimation for linear and directional
data. For the sake of simplicity, $f$ will denote the target density throughout the paper, which may be
linear, directional, or directional-linear, depending on the context.

Let $Z$ denote a linear random variable with support $\mathrm{Supp}(Z) \subseteq \mathbb{R}$ and density $f$. Consider
$Z_1, \ldots, Z_n$ a random sample of $Z$, with size $n$. The linear kernel density estimator introduced
by Rosenblatt (1956) and Parzen (1962) is defined as

$$\hat f_g(z) = \frac{1}{ng} \sum_{i=1}^{n} K\left(\frac{z - Z_i}{g}\right), \quad z \in \mathbb{R}, \qquad (1)$$

where $K$ denotes the kernel, usually a symmetric density about the origin, and $g > 0$ is the
bandwidth parameter, which controls the smoothness of the estimator. Specifically, large values of the
bandwidth parameter will produce oversmoothed estimates of $f$, whereas small values will provide
undersmoothed curves. The asymptotic properties of this estimator and its adaptation to different
contexts yielded a remarkably prolific field within the statistical literature, as noted in the
introduction.

It is well known that, under some regularity conditions on the kernel and the target density, the
bias of the estimator (1) is of order $O(g^2)$, whereas the variance is $O((ng)^{-1})$, clearly showing the
need to account for a trade-off between bias and variance in any bandwidth selection procedure.
Specifically, the expected value of the linear kernel estimator at $z \in \mathbb{R}$ is:

$$E\left[\hat f_g(z)\right] = f(z) + \frac{1}{2}\mu_2(K) f''(z) g^2 + o\left(g^2\right),$$

where $\mu_p(K) = \int_{\mathbb{R}} z^p K(z)\,dz$ represents the $p$-th moment of the kernel $K$. Similarly, the variance
of (1) at $z \in \mathbb{R}$ is given by:

$$\mathrm{Var}\left[\hat f_g(z)\right] = (ng)^{-1} R(K) f(z) + o\left((ng)^{-1}\right),$$

where $R(K) = \int_{\mathbb{R}} K^2(z)\,dz$. Further details on computations for the linear kernel density estimator
can be found in Section 2.5 of Wand and Jones (1995).
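The estimator (1) can be sketched in a few lines. The block below is only an illustration, assuming a standard normal kernel $K$; the function name `linear_kde`, the simulated sample and the bandwidth value are hypothetical choices, not part of the original text.

```python
import numpy as np

def linear_kde(z, sample, g):
    """Evaluate the linear kernel density estimator (1) with a standard normal kernel K."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    sample = np.asarray(sample, dtype=float)
    u = (z[:, None] - sample[None, :]) / g           # (z - Z_i) / g for every pair
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)   # K(u), standard normal density
    return k.sum(axis=1) / (sample.size * g)

# Illustrative data: 200 draws from N(0, 1) and an arbitrary bandwidth.
rng = np.random.default_rng(0)
sample = rng.normal(size=200)
grid = np.linspace(-8.0, 8.0, 2001)
dens = linear_kde(grid, sample, g=0.4)
mass = np.sum(dens) * (grid[1] - grid[0])            # Riemann sum of the estimate
```

Since $K$ is a density, the estimate integrates to one for any $g > 0$; `mass` approximates that integral on the grid.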

As previously mentioned, kernel density estimation has been adapted to different contexts such as
directional data, that is, data on a $q$-dimensional sphere, with circular data ($q = 1$) and spherical
data ($q = 2$) as particular cases. In the next sections, the directional kernel density estimator will
be reviewed, and a directional-linear estimator will be introduced.

2.1 Kernel density estimation for directional data 

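Let $X$ denote a directional random variable with density $f$. The support of such a variable is the
$q$-dimensional sphere, denoted by $\Omega_q = \left\{x \in \mathbb{R}^{q+1} : x_1^2 + \cdots + x_{q+1}^2 = 1\right\}$. The Lebesgue measure
in $\Omega_q$ will be denoted by $\omega_q$ and, therefore, a directional density satisfies

$$\int_{\Omega_q} f(x)\,\omega_q(dx) = 1.$$

Remark 1. When there is no possible misunderstanding, $\omega_q$ will also denote the surface area of $\Omega_q$:

$$\omega_q = \omega_q(\Omega_q) = \frac{2\pi^{\frac{q+1}{2}}}{\Gamma\left(\frac{q+1}{2}\right)}, \quad q \geq 1,$$

where $\Gamma$ represents the Gamma function defined as $\Gamma(p) = \int_0^\infty x^{p-1} e^{-x}\,dx$, for $p > 0$.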
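As a quick check of the surface area formula in Remark 1, the following sketch evaluates $\omega_q$ with the standard library Gamma function. The name `sphere_area` is a hypothetical helper introduced only for this illustration.

```python
import math

def sphere_area(q):
    """Surface area omega_q of the q-dimensional sphere embedded in R^(q+1)."""
    return 2.0 * math.pi ** ((q + 1) / 2.0) / math.gamma((q + 1) / 2.0)

# Circle (q = 1), ordinary sphere (q = 2) and the 3-sphere (q = 3).
areas = {q: sphere_area(q) for q in (1, 2, 3)}
```

For $q = 1$ and $q = 2$ this recovers the familiar values $2\pi$ (length of the unit circle) and $4\pi$ (area of the unit sphere).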

The directional kernel density estimator was proposed by Hall et al. (1987) and Bai et al. (1988),
following two different perspectives in the treatment of directional data. In this paper, the definition
in Bai et al. (1988) will be considered, although it can also be related with one of the proposals in
Hall et al. (1987). Given a random sample $X_1, \ldots, X_n$ of a directional variable $X$ with density $f$,
the directional kernel density estimator is given by:

$$\hat f_h(x) = \frac{c_{h,q}(L)}{n} \sum_{i=1}^{n} L\left(\frac{1 - x^T X_i}{h^2}\right), \quad x \in \Omega_q, \qquad (2)$$

where $L$ is the directional kernel, $h > 0$ is the bandwidth parameter and $c_{h,q}(L)$ is a normalizing
constant depending on the kernel $L$, the bandwidth $h$ and the dimension $q$. The scalar product of
two vectors, $x$ and $y$, is denoted by $x^T y$, where $T$ is the transpose operator.

In this setting, directional kernels are not directional densities but functions of rapid decay.
Therefore, to ensure that the resulting estimator is indeed a directional density, the normalizing constant
$c_{h,q}(L)$ is needed. Specifically (see Bai et al. (1988)), the inverse of this normalizing constant for
any $x \in \Omega_q$ is given by

$$c_{h,q}(L)^{-1} = \int_{\Omega_q} L\left(\frac{1 - x^T y}{h^2}\right) \omega_q(dy) = h^q \lambda_{h,q}(L) \sim h^q \lambda_q(L), \qquad (3)$$

with $\lambda_{h,q}(L) = \omega_{q-1} \int_0^{2h^{-2}} L(r)\, r^{\frac{q}{2}-1} (2 - r h^2)^{\frac{q}{2}-1}\,dr$ and $\lambda_q(L) = 2^{\frac{q}{2}-1} \omega_{q-1} \int_0^\infty L(r)\, r^{\frac{q}{2}-1}\,dr$. The
asymptotic behaviour of $\lambda_{h,q}(L)$ is established in Lemma 1 and the notation $a_n \sim b_n$ indicates that
$a_n / b_n \to 1$ as $n \to \infty$ (see also Bai et al. (1988) and Zhao and Wu (2001)).

Properties of the directional kernel density estimator (2) have been analyzed by Bai et al. (1988),
who proved pointwise, uniform and $L_1$-norm consistency. A central limit theorem for the integrated
squared error of the estimator has been established by Zhao and Wu (2001), as well as the expression
for the bias under some regularity conditions, stated below:

Conditions:

D1. Extend $f$ from $\Omega_q$ to $\mathbb{R}^{q+1} \setminus \{0\}$ by defining $f(x) \equiv f\left(x / \|x\|\right)$ for all $x \neq 0$, where $\|\cdot\|$ denotes
the Euclidean norm. Assume that the gradient vector $\nabla f(x) = \left(\frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_{q+1}}\right)^T$ and the
Hessian matrix $\mathcal{H} f(x) = \left(\frac{\partial^2 f(x)}{\partial x_i \partial x_j}\right)_{1 \leq i,j \leq q+1}$ exist, are continuous on $\mathbb{R}^{q+1} \setminus \{0\}$ and square
integrable.

D2. Assume that $L : [0, \infty) \to [0, \infty)$ is a bounded and Riemann integrable function such that

$$0 < \int_0^\infty L^k(r)\, r^{\frac{q}{2}-1}\,dr < \infty, \quad \forall q \geq 1, \text{ for } k = 1, 2.$$

D3. Assume that $h = h_n$ is a sequence of positive numbers such that $h_n \to 0$ and $n h_n^q \to \infty$ as
$n \to \infty$.

Remark 2. $L$ must be a rapidly decreasing function, quite different from the bell-shaped kernels $K$
involved in the linear estimator (1). To verify D2, $L$ must decrease faster than any power function,
since $\int_0^\infty r^{\alpha} r^{\frac{q}{2}-1}\,dr = \infty$, $\forall \alpha \in \mathbb{R}$, $\forall q \geq 1$.

Lemma 2 in Zhao and Wu (2001) states that, under the previous conditions D1-D3, the expected
value of the directional kernel density estimator at a point $x \in \Omega_q$ is

$$E\left[\hat f_h(x)\right] = f(x) + b_q(L) \Psi(f, x) h^2 + o\left(h^2\right),$$

where

$$\Psi(f, x) = -x^T \nabla f(x) + q^{-1}\left(\nabla^2 f(x) - x^T \mathcal{H} f(x)\, x\right), \qquad (4)$$

$$b_q(L) = \int_0^\infty L(r)\, r^{\frac{q}{2}}\,dr \Big/ \int_0^\infty L(r)\, r^{\frac{q}{2}-1}\,dr, \qquad (5)$$

with $\nabla^2 f(x) = \sum_{i=1}^{q+1} \frac{\partial^2 f(x)}{\partial x_i^2}$ the Laplacian of $f$. Note that the bias is of order $O(h^2)$, but in (4),
apart from the curvature of the target density, which is captured by the Hessian matrix, a gradient
vector also appears. On the other hand, the scaling constant $b_q(L)$ can be interpreted as a kind
of moment of the directional kernel $L$. Note that condition D2 with $k = 1$ is needed for the bias
computation. The same condition with $k = 2$ is required for deriving the pointwise variance of the
estimator (2), stated in the following result.



Proposition 1. Under conditions D1-D3, the variance of $\hat f_h(x)$ at $x \in \Omega_q$ is given by

$$\mathrm{Var}\left[\hat f_h(x)\right] = \frac{c_{h,q}(L)}{n} d_q(L) f(x) + o\left((n h^q)^{-1}\right),$$

where

$$d_q(L) = \int_0^\infty L^2(r)\, r^{\frac{q}{2}-1}\,dr \Big/ \int_0^\infty L(r)\, r^{\frac{q}{2}-1}\,dr.$$

Regarding the normalizing constant expression (3), the order of the variance is $O\left((n h^q)^{-1}\right)$, where
$q$ is the dimension of the sphere. This order coincides with the corresponding one for a multivariate
kernel density estimator in $\mathbb{R}^q$ (see Scott (1992)).






A popular choice for the directional kernel is $L(r) = e^{-r}$, $r \geq 0$, also known as the von Mises kernel
due to its relation with the von Mises-Fisher distribution (see Watson (1983)). In a $q$-dimensional
sphere, the von Mises model $vM(\mu, \kappa)$ has density

$$f_{vM}(x; \mu, \kappa) = C_q(\kappa) \exp\left\{\kappa x^T \mu\right\}, \quad C_q(\kappa) = \frac{\kappa^{\frac{q-1}{2}}}{(2\pi)^{\frac{q+1}{2}} I_{\frac{q-1}{2}}(\kappa)}, \qquad (6)$$

where $\mu \in \Omega_q$ is the directional mean, $\kappa \geq 0$ is the concentration parameter around the mean and
$I_\nu$ is the modified Bessel function of order $\nu$. In Figure 1 (left plot), the contour plot of a
spherical von Mises density is shown.

For the particular case of the target density being a $q$-dimensional von Mises $vM(\mu, \kappa)$, the term
(4) in the bias computation becomes:

$$\Psi\left(f_{vM}(\cdot; \mu, \kappa), x\right) = \kappa\, C_q(\kappa)\, e^{\kappa x^T \mu} \left(-x^T \mu + \kappa q^{-1}\left(1 - (x^T \mu)^2\right)\right).$$

As $\kappa \to 0$, which means that the distribution is approaching a uniform model in the sphere, the
previous term also goes to zero.

Considering the von Mises kernel in the directional estimator (2) allows for its interpretation as a
mixture of $q$-von Mises-Fisher densities

$$\hat f_h(x) = \frac{1}{n} \sum_{i=1}^{n} f_{vM}\left(x; X_i, 1/h^2\right), \qquad (7)$$

where, for each von Mises component, the mean value is the $i$-th observation $X_i$ and the concentration
is given by $1/h^2$, involving the smoothing parameter.
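The mixture form (7) gives a direct way to evaluate the directional estimator. The sketch below is a minimal illustration under stated assumptions: points are given as unit vectors in rows, the helper names `log_vmf_const` and `directional_kde` are hypothetical, and the constant $C_q$ of (6) is computed on the log scale with SciPy's exponentially scaled Bessel function `ive` to avoid overflow for small $h$.

```python
import numpy as np
from scipy.special import ive

def log_vmf_const(q, kappa):
    """log C_q(kappa) from (6), using I_nu(kappa) = ive(nu, kappa) * exp(kappa)."""
    nu = (q - 1) / 2.0
    return (nu * np.log(kappa) - (nu + 1.0) * np.log(2.0 * np.pi)
            - np.log(ive(nu, kappa)) - kappa)

def directional_kde(x, sample, h):
    """Estimator (2) with L(r) = exp(-r), evaluated through the mixture form (7)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))            # (m, q+1) unit vectors
    sample = np.atleast_2d(np.asarray(sample, dtype=float))  # (n, q+1) unit vectors
    q = sample.shape[1] - 1
    kappa = 1.0 / h**2
    # log f_vM(x; X_i, kappa) = log C_q(kappa) + kappa * x^T X_i
    log_terms = log_vmf_const(q, kappa) + kappa * (x @ sample.T)
    return np.exp(log_terms).mean(axis=1)

# Illustrative circular data (q = 1): 100 von Mises angles mapped to unit vectors.
rng = np.random.default_rng(1)
theta = rng.vonmises(mu=0.0, kappa=2.0, size=100)
sample = np.column_stack([np.cos(theta), np.sin(theta)])
angles = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
grid = np.column_stack([np.cos(angles), np.sin(angles)])
dens = directional_kde(grid, sample, h=0.3)
mass = dens.mean() * 2.0 * np.pi     # periodic quadrature of the estimate over the circle
```

Because each component in (7) is a density on $\Omega_q$, the estimate integrates to one; `mass` checks this numerically on the circle.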




Figure 1: Left: contour plot of a von Mises density $vM(\mu, \kappa)$, with $\mu = (0, 0, 1)$ and $\kappa = 1$. Right: contour
plot of a mixture of von Mises densities (14).

In addition, the normalizing constant (3) appearing in the construction of the directional kernel
estimator (2) has a simple expression for a von Mises kernel, given by:

$$c_{h,q}(L) = C_q\left(1/h^2\right) e^{1/h^2}, \qquad (8)$$

with $C_q$ given in (6).

For a general kernel, the asymptotic behaviour of $c_{h,q}(L)^{-1}$ was remarked in (3) and it can be
specified for the von Mises kernel. In this case, (8) depends on $C_q(1/h^2)$, which involves a Bessel
function of order $(q-1)/2$. Applying the asymptotic expansion of $I_\nu$ for large arguments, it can be
seen that $I_\nu(z) = \frac{e^z}{\sqrt{2\pi z}}\left(1 + O\left(z^{-1}\right)\right)$, as $z \to \infty$, and $c_{h,q}(L)^{-1}$ presents also a simple form:

$$c_{h,q}(L)^{-1} = (2\pi)^{\frac{q+1}{2}} e^{-1/h^2} h^{q-1} e^{1/h^2} \left(\frac{h}{\sqrt{2\pi}} + O\left(h^3\right)\right) = (2\pi)^{\frac{q}{2}} h^q + O\left(h^{q+2}\right).$$

Finally, the other terms involved in bias and variance, namely $b_q(L)$ and $d_q(L)$, become

$$b_q(L) = \frac{q}{2}, \quad d_q(L) = 2^{-\frac{q}{2}}, \quad \forall q \geq 1,$$

for the von Mises kernel. The proofs of these results are collected in Lemma 3.
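These constants can be checked numerically. The sketch below is an illustration only: it evaluates the integrals in (5) and in Proposition 1 for $L(r) = e^{-r}$ by quadrature, and compares the inverse normalizing constant, taken here as $e^{-1/h^2} / C_q(1/h^2)$ (the value obtained by matching (2) with the mixture form (7)), against its leading term $(2\pi)^{q/2} h^q$. The helper names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import ive

def radial_integral(func, power):
    """Integral of func(r) * r**power over (0, inf); r = s**2 removes the endpoint singularity."""
    val, _ = quad(lambda s: 2.0 * func(s * s) * s ** (2.0 * power + 1.0), 0.0, np.inf)
    return val

def kernel_constants(q):
    """b_q(L) from (5) and d_q(L) from Proposition 1 for the von Mises kernel L(r) = exp(-r)."""
    vm = lambda r: np.exp(-r)
    denom = radial_integral(vm, q / 2.0 - 1.0)
    b_q = radial_integral(vm, q / 2.0) / denom
    d_q = radial_integral(lambda r: vm(r) ** 2, q / 2.0 - 1.0) / denom
    return b_q, d_q

def inv_norm_const(q, h):
    """1 / c_{h,q}(L) = exp(-kappa) / C_q(kappa) with kappa = 1/h^2, via the scaled Bessel function."""
    kappa = 1.0 / h**2
    nu = (q - 1) / 2.0
    return (2.0 * np.pi) ** (nu + 1.0) * ive(nu, kappa) * kappa ** (-nu)

constants = {q: kernel_constants(q) for q in (1, 2, 3)}
ratio_circle = inv_norm_const(1, 0.05) / (np.sqrt(2.0 * np.pi) * 0.05)
ratio_sphere = inv_norm_const(2, 0.05) / (2.0 * np.pi * 0.05**2)
```

The computed pairs should agree with $(q/2, 2^{-q/2})$, and both ratios should be close to one for small $h$, in line with the $O(h^{q+2})$ remainder.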



2.2 Kernel density estimation for directional-linear data

Consider a directional-linear random variable, $(X, Z)$, with support $\mathrm{Supp}(X, Z) \subseteq \Omega_q \times \mathbb{R}$ and joint
density $f$. For the simple case of circular data ($q = 1$), the support of the variable is the cylinder.
Following the ideas in the previous sections for the linear and directional cases, given a random
sample $(X_1, Z_1), \ldots, (X_n, Z_n)$, the directional-linear kernel density estimator can be defined as:

$$\hat f_{h,g}(x, z) = \frac{c_{h,q}(L)}{ng} \sum_{i=1}^{n} LK\left(\frac{1 - x^T X_i}{h^2}, \frac{z - Z_i}{g}\right), \quad (x, z) \in \Omega_q \times \mathbb{R}, \qquad (9)$$

where $LK$ is a directional-linear kernel, $g$ is the bandwidth parameter for the linear component, $h$
the bandwidth parameter for the directional component and $c_{h,q}(L)$ is the normalizing constant for
the directional part, defined in (3). For the sake of simplicity, a product kernel $LK(\cdot, \cdot) = L(\cdot) \times K(\cdot)$
will be considered throughout this paper. Although a product kernel formulation has been adopted, the
results could be generalized to a general directional-linear kernel, with suitable modifications in the
required conditions.

In the next section, expressions for the bias, variance and asymptotic normality of the estimator in 
(9) will be given. The proofs of these results can be seen in the Appendix. 
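A minimal sketch of the estimator (9) follows, assuming the product of the von Mises kernel $L(r) = e^{-r}$ and a standard normal $K$. The function name `dirlin_kde`, the simulated circular-linear sample and the bandwidth values are hypothetical; the directional normalizing constant is taken as $C_q(1/h^2) e^{1/h^2}$, the value that turns each directional term into a von Mises density as in (7).

```python
import numpy as np
from scipy.special import ive

def dirlin_kde(x, z, sample_x, sample_z, h, g):
    """Estimator (9) with LK(r, t) = exp(-r) * phi(t), at paired evaluation points (x_k, z_k)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))            # (m, q+1) unit vectors
    z = np.atleast_1d(np.asarray(z, dtype=float))            # (m,) linear coordinates
    sample_x = np.atleast_2d(np.asarray(sample_x, dtype=float))
    sample_z = np.asarray(sample_z, dtype=float)
    q = sample_x.shape[1] - 1
    nu, kappa = (q - 1) / 2.0, 1.0 / h**2
    # log c_{h,q}(L) = log C_q(kappa) + kappa, written with the scaled Bessel function
    log_c = nu * np.log(kappa) - (nu + 1.0) * np.log(2.0 * np.pi) - np.log(ive(nu, kappa))
    dir_part = np.exp(log_c - kappa * (1.0 - x @ sample_x.T))  # c * L((1 - x^T X_i) / h^2)
    u = (z[:, None] - sample_z[None, :]) / g
    lin_part = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * g)  # K((z - Z_i) / g) / g
    return (dir_part * lin_part).mean(axis=1)

# Illustrative circular-linear sample (q = 1) and a product grid on the cylinder.
rng = np.random.default_rng(2)
theta_s = rng.vonmises(0.0, 2.0, size=50)
sample_x = np.column_stack([np.cos(theta_s), np.sin(theta_s)])
sample_z = rng.normal(size=50)
angles = np.linspace(0.0, 2.0 * np.pi, 120, endpoint=False)
zs = np.linspace(-8.0, 8.0, 321)
aa, zz = np.meshgrid(angles, zs, indexing="ij")
pts = np.column_stack([np.cos(aa.ravel()), np.sin(aa.ravel())])
dens = dirlin_kde(pts, zz.ravel(), sample_x, sample_z, h=0.5, g=0.5)
mass = dens.sum() * (angles[1] - angles[0]) * (zs[1] - zs[0])
```

The two bandwidths act separately on the two components, so `mass` should again be close to one whatever the pair $(h, g)$.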



3 Main results 

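Before stating the main results, some notation will be introduced. The target directional-linear
density will be denoted by $f$. The gradient vector and Hessian matrix of $f$, with respect to both
components (directional and linear), are defined in this setting as:

$$\nabla f(x, z) = \left(\frac{\partial f(x, z)}{\partial x_1}, \ldots, \frac{\partial f(x, z)}{\partial x_{q+1}}, \frac{\partial f(x, z)}{\partial z}\right)^T = \left(\nabla_x f(x, z), \nabla_z f(x, z)\right)^T,$$

$$\mathcal{H} f(x, z) = \begin{pmatrix}
\frac{\partial^2 f(x, z)}{\partial x_1^2} & \cdots & \frac{\partial^2 f(x, z)}{\partial x_1 \partial x_{q+1}} & \frac{\partial^2 f(x, z)}{\partial x_1 \partial z} \\
\vdots & \ddots & \vdots & \vdots \\
\frac{\partial^2 f(x, z)}{\partial x_{q+1} \partial x_1} & \cdots & \frac{\partial^2 f(x, z)}{\partial x_{q+1}^2} & \frac{\partial^2 f(x, z)}{\partial x_{q+1} \partial z} \\
\frac{\partial^2 f(x, z)}{\partial z \partial x_1} & \cdots & \frac{\partial^2 f(x, z)}{\partial z \partial x_{q+1}} & \frac{\partial^2 f(x, z)}{\partial z^2}
\end{pmatrix}
= \begin{pmatrix}
\mathcal{H}_x f(x, z) & \mathcal{H}_{x,z} f(x, z) \\
\mathcal{H}_{x,z} f(x, z)^T & \mathcal{H}_z f(x, z)
\end{pmatrix},$$

where subscripts $x$ and $z$ are used to denote the derivatives with respect to the directional and linear
components, respectively. The Laplacian of $f$ restricted to the directional component is denoted by

$$\nabla_x^2 f(x, z) = \sum_{i=1}^{q+1} \frac{\partial^2 f(x, z)}{\partial x_i^2}.$$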

The following conditions will be required throughout this section.
Conditions:

DL1. Extend $f$ from $\Omega_q \times \mathbb{R}$ to $\mathbb{R}^{q+2} \setminus A$, $A = \left\{(x, z) \in \mathbb{R}^{q+2} : x = 0\right\}$, by defining $f(x, z) \equiv
f\left(x / \|x\|, z\right)$ for all $x \neq 0$ and $z \in \mathbb{R}$, where $\|\cdot\|$ denotes the Euclidean norm. Assume that
$\nabla f(x, z)$ and $\mathcal{H} f(x, z)$ exist, are continuous and square integrable.

DL2. Assume that the directional kernel $L$ satisfies condition D2 and the linear kernel $K$ is a
bounded linear density function, symmetric around zero and with finite second order moment.

DL3. Assume that $h = h_n$ and $g = g_n$ are sequences of positive numbers such that $h_n \to 0$, $g_n \to 0$
and $n h_n^q g_n \to \infty$ as $n \to \infty$.

The next two results provide the expressions for the bias and the variance of the kernel density 
estimator (9). 

Proposition 2. Under conditions DL1-DL3, the expected value of the directional-linear kernel
density estimator (9) at a point $(x, z) \in \Omega_q \times \mathbb{R}$ is given by

$$E\left[\hat f_{h,g}(x, z)\right] = f(x, z) + b_q(L) \Psi_x(f, x, z) h^2 + \frac{1}{2}\mu_2(K) \mathcal{H}_z f(x, z) g^2 + o\left(h^2\right) + o\left(g^2\right),$$

where

$$\Psi_x(f, x, z) = -x^T \nabla_x f(x, z) + q^{-1}\left(\nabla_x^2 f(x, z) - x^T \mathcal{H}_x f(x, z)\, x\right).$$

Proposition 3. Under conditions DL1-DL3, the variance for the directional-linear kernel density
estimator (9) at a point $(x, z) \in \Omega_q \times \mathbb{R}$ is given by

$$\mathrm{Var}\left[\hat f_{h,g}(x, z)\right] = \frac{c_{h,q}(L)}{ng} R(K) d_q(L) f(x, z) + o\left((n h^q g)^{-1}\right).$$

In view of the previous results, some comments must be made. Firstly, the effects of the directional
and linear parts can be clearly identified. For the bias, marginal contributions appear as two
addends, and the remaining orders from each part are also separated. For the variance, the terms
corresponding to both parts can also be identified, although they turn up in a product form. In
addition, the respective orders for bias and variance are analogous to those obtained with a
$(q + 1)$-multivariate estimator in $\mathbb{R}^{q+1}$ (see Scott (1992)).

It can also be proved that the directional-linear kernel density estimator (9) is asymptotically
normal, under the same conditions as those used for deriving the expected value and the
variance, and a further smoothness property on the product kernel.

Theorem 1. Under conditions DL1-DL3, if $\int_0^\infty \int_{\mathbb{R}} LK^{2+\delta}(r, v)\, r^{\frac{q}{2}-1}\,dv\,dr < \infty$ for some $\delta > 0$,
then the directional-linear kernel density estimator (9) is asymptotically normal:

$$\sqrt{n h^q g}\left(\hat f_{h,g}(x, z) - f(x, z) - \mathrm{ABias}\left[\hat f_{h,g}(x, z)\right]\right) \xrightarrow{d} \mathcal{N}\left(0, R(K) d_q(L) f(x, z)\right),$$

pointwise in $(x, z) \in \Omega_q \times \mathbb{R}$, where $\mathrm{ABias}\left[\hat f_{h,g}(x, z)\right] = b_q(L) \Psi_x(f, x, z) h^2 + \frac{1}{2}\mu_2(K) \mathcal{H}_z f(x, z) g^2$.

The smoothness condition on the directional-linear kernel is required in order to ensure Lyapunov's 
condition and obtain the asymptotic normal distribution. Again, the effect of the two parts can be 
identified in the previous equation, as well as in the rate of convergence of the estimator. 
4 Error measurement and optimal bandwidth



The analysis of the performance of the kernel density estimator requires the specification of
appropriate error criteria. Consider a generic kernel density estimator $\hat f$, which can be linear, directional
or directional-linear. A global error measurement for quantifying the overall performance of such
estimator is given by the Mean Integrated Squared Error (MISE):

$$\mathrm{MISE}\left[\hat f\right] = E\left[\int \left(\hat f(u) - f(u)\right)^2 du\right].$$



The MISE can be interpreted as a function of the bandwidth and its minimization yields an optimal 
bandwidth in the sense of the quadratic loss. 



For the linear kernel density estimator (1) and under some regularity conditions (see Wand and
Jones (1995)), the MISE is given by:

$$\mathrm{MISE}\left[\hat f_g(\cdot)\right] = \frac{1}{4}\mu_2(K)^2 R(f'') g^4 + (ng)^{-1} R(K) + o\left((ng)^{-1}\right).$$

The asymptotic version of the MISE, namely the AMISE, can be used to derive an optimal
bandwidth that minimizes this error. This optimal bandwidth is given by

$$g_{\mathrm{AMISE}} = \left[\frac{R(K)}{\mu_2(K)^2 R(f'')\, n}\right]^{\frac{1}{5}}.$$

Although the previous expression does not provide a bandwidth value in practice, given that it
depends on the curvature of the target density $R(f'')$, some interesting issues should be noticed.
For instance, the order of the asymptotic optimal bandwidth is $O(n^{-1/5})$. Also, this result is the
starting point of more sophisticated bandwidth selectors such as the ones given by Sheather and
Jones (1991) and Cao (1993). A comparison of the performance of different bandwidth selectors can
be found in Cao et al. (1994), whereas Jones et al. (1996) provide a review on bandwidth selection
methods.
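To make the dependence on $R(f'')$ concrete, the sketch below evaluates the AMISE bandwidth when the target is assumed to be a $N(0, \sigma^2)$ density and $K$ is the standard normal kernel. This normal-reference setting is an assumption made only for illustration, and `g_amise_normal` is a hypothetical helper; in that case the formula reduces to $(4/3)^{1/5} \sigma n^{-1/5}$.

```python
import numpy as np
from scipy.integrate import quad

def g_amise_normal(sigma, n):
    """AMISE bandwidth for a N(0, sigma^2) target and a standard normal kernel K."""
    r_k = 1.0 / (2.0 * np.sqrt(np.pi))   # R(K) for the standard normal kernel
    mu2_k = 1.0                          # second moment of the standard normal kernel

    def f_second(z):
        # Second derivative of the N(0, sigma^2) density.
        phi = np.exp(-0.5 * (z / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)
        return phi * (z**2 - sigma**2) / sigma**4

    r_f2, _ = quad(lambda z: f_second(z) ** 2, -np.inf, np.inf)   # R(f'')
    return (r_k / (mu2_k**2 * r_f2 * n)) ** 0.2

g_100 = g_amise_normal(1.0, 100)
g_closed = (4.0 / 3.0) ** 0.2 * 100 ** (-0.2)   # closed form for the normal case
```

The numerical value `g_100` and the closed form `g_closed` should coincide up to quadrature error, and both decay at the rate $n^{-1/5}$.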

In the previous sections, the bias and variance for the directional kernel estimator (see Zhao and
Wu (2001) for the bias and Proposition 1 for the variance) and for the directional-linear kernel
estimator (Propositions 2 and 3) were obtained. Hence, it is straightforward to get the MSE and
MISE for these estimators.

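Proposition 4. Under conditions D1-D3, the MISE for the directional kernel density estimator
(2) is given by

$$\mathrm{MISE}\left[\hat f_h(\cdot)\right] = b_q(L)^2 \int_{\Omega_q} \Psi(f, x)^2\,\omega_q(dx)\, h^4 + \frac{c_{h,q}(L)}{n} d_q(L) + o\left((n h^q)^{-1}\right).$$

Following Wand and Jones (1995), $\mathrm{MISE}\left[\hat f_h(\cdot)\right] = \mathrm{AMISE}\left[\hat f_h(\cdot)\right] + o\left((n h^q)^{-1}\right)$, so that
$\mathrm{AMISE}\left[\hat f_h(\cdot)\right]$ provides a suitable large sample approximation that allows for the computation of an
optimal bandwidth with closed expression, minimizing this asymptotic error criterion.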

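Corollary 1. The AMISE optimal bandwidth for the directional kernel density estimator (2) is
given by

$$h_{\mathrm{AMISE}} = \left[\frac{q\, d_q(L)}{4 b_q(L)^2 \lambda_q(L) R\left(\Psi(f, \cdot)\right) n}\right]^{\frac{1}{4+q}},$$

where $R\left(\Psi(f, \cdot)\right) = \int_{\Omega_q} \Psi(f, x)^2\,\omega_q(dx)$ and $\lambda_q(L) = 2^{\frac{q}{2}-1} \omega_{q-1} \int_0^\infty L(r)\, r^{\frac{q}{2}-1}\,dr$.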
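As an illustration of Corollary 1, the sketch below computes the AMISE bandwidth on the circle ($q = 1$) when the target is assumed to be a von Mises density and the kernel is $L(r) = e^{-r}$. It uses the von Mises expression of $\Psi$ given after (6), the constants $b_q(L) = q/2$ and $d_q(L) = 2^{-q/2}$, and $\lambda_q(L) = (2\pi)^{q/2}$, which follows from the definition of $\lambda_q(L)$ for this kernel. The function name and the grid size are hypothetical choices.

```python
import numpy as np
from scipy.special import ive

def h_amise_circular_vm(kappa, n, n_grid=4096):
    """AMISE bandwidth of Corollary 1 for q = 1, L(r) = exp(-r) and a vM(mu, kappa) target."""
    q = 1
    b_q = q / 2.0                          # b_q(L) for the von Mises kernel
    d_q = 2.0 ** (-q / 2.0)                # d_q(L) for the von Mises kernel
    lam_q = (2.0 * np.pi) ** (q / 2.0)     # lambda_q(L) for the von Mises kernel
    theta = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    t = np.cos(theta)                      # x^T mu, with theta the angle between x and mu
    # vM density on the circle, exp(kappa t) / (2 pi I_0(kappa)), in overflow-safe form
    f = np.exp(kappa * (t - 1.0)) / (2.0 * np.pi * ive(0, kappa))
    psi = kappa * f * (-t + kappa / q * (1.0 - t**2))
    r_psi = np.mean(psi**2) * 2.0 * np.pi  # R(Psi(f, .)) by periodic quadrature
    return (q * d_q / (4.0 * b_q**2 * lam_q * r_psi * n)) ** (1.0 / (4.0 + q))

h_100 = h_amise_circular_vm(2.0, 100)
h_1000 = h_amise_circular_vm(2.0, 1000)
```

The ratio `h_1000 / h_100` equals $10^{-1/5}$, reflecting the $n^{-1/(4+q)}$ rate of the optimal bandwidth for $q = 1$.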
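Expressions for MISE and AMISE can also be derived for the directional-linear estimator. In
order to simplify the notation, in general, $I\left[\phi(\cdot, \cdot)\right] = \int_{\Omega_q \times \mathbb{R}} \phi(x, z)\,\omega_q(dx)\,dz$, for a function $\phi :
\Omega_q \times \mathbb{R} \to \mathbb{R}$.

Proposition 5. Under conditions DL1-DL3, the MISE for the directional-linear kernel density
estimator (9) is given by

$$\mathrm{MISE}\left[\hat f_{h,g}(\cdot, \cdot)\right] = b_q(L)^2 I\left[\Psi_x(f, \cdot, \cdot)^2\right] h^4 + \frac{1}{4}\mu_2(K)^2 I\left[\mathcal{H}_z f(\cdot, \cdot)^2\right] g^4$$
$$+\, b_q(L) \mu_2(K) I\left[\Psi_x(f, \cdot, \cdot)\, \mathcal{H}_z f(\cdot, \cdot)\right] h^2 g^2 + \frac{c_{h,q}(L)}{ng} d_q(L) R(K) + o\left((n h^q g)^{-1}\right).$$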

Unfortunately, it is not straightforward to derive a full closed expression for the optimal pair of
bandwidths $(h, g)_{\mathrm{AMISE}}$, although it is possible to compute them by numerical optimization. However,
such a closed expression can be obtained for the particular case $q = 1$, where the circular and
linear bandwidths can be considered as proportional.

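Corollary 2. Consider the parametrization $g = \beta h$. The optimal AMISE pair of bandwidths
$(h, g)_{\mathrm{AMISE}} = (h_{\mathrm{AMISE}}, \beta h_{\mathrm{AMISE}})$ can be obtained from

$$h_{\mathrm{AMISE}} = \left[\frac{(q + 1)\, d_q(L) R(K)}{4 \beta \lambda_q(L) R\left(b_q(L) \Psi_x(f, \cdot, \cdot) + \frac{\beta^2}{2}\mu_2(K) \mathcal{H}_z f(\cdot, \cdot)\right) n}\right]^{\frac{1}{5+q}},$$

where $R\left(b_q(L) \Psi_x(f, \cdot, \cdot) + \frac{\beta^2}{2}\mu_2(K) \mathcal{H}_z f(\cdot, \cdot)\right) = \int_{\Omega_q \times \mathbb{R}} \left(b_q(L) \Psi_x(f, x, z) + \frac{\beta^2}{2}\mu_2(K) \mathcal{H}_z f(x, z)\right)^2
\omega_q(dx)\,dz$ and $\lambda_q(L)$ is defined as in the previous corollary. For the circular-linear data case ($q = 1$),
the parameter $\beta$ is given by: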

$$\beta = \left(\frac{b_1(L)^2\, I\left[\Psi_x(f, \cdot, \cdot)^2\right]}{\frac{1}{4}\mu_2(K)^2\, I\left[\mathcal{H}_z f(\cdot, \cdot)^2\right]}\right)^{\frac{1}{4}}.$$



Although the orders of the AMISE bandwidths have not been formally derived, a quite
plausible conjecture is that for $q > 1$, $(h, g)_{\mathrm{AMISE}} = \left(O\left(n^{-1/(4+q)}\right), O\left(n^{-1/5}\right)\right)$ or, equivalently,
that $\beta = \beta_n = O\left(n^{-(q-1)/(5(4+q))}\right)$. Indeed, this is satisfied for $q = 1$.



Finally, it is interesting to note that, considering $g = \beta h$, a single bandwidth for the kernel estimator
(9) is required, and the optimal bandwidth under this formulation has order $O\left(n^{-1/(5+q)}\right)$. This
coincides with the order of the kernel linear estimator in $\mathbb{R}^p$, with $p = \dim(\Omega_q \times \mathbb{R}) = q + 1$.
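The numerical optimization mentioned above can be sketched as follows. The block is only an illustration: the integral constants of Proposition 5 are passed in as plain numbers (the values used are arbitrary, not taken from any density), the normalizing constant is replaced by its leading term $c_{h,q}(L) \approx (h^q \lambda_q(L))^{-1}$ from (3), and the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def amise_dirlin(h, g, n, q, a, b, c, d):
    """Leading terms of the MISE in Proposition 5, with the integral constants given as numbers:
    a = b_q(L)^2 I[Psi_x^2], b = mu_2(K)^2 I[(H_z f)^2] / 4,
    c = b_q(L) mu_2(K) I[Psi_x H_z f], d = d_q(L) R(K) / lambda_q(L)."""
    return a * h**4 + b * g**4 + c * h**2 * g**2 + d / (n * h**q * g)

def amise_bandwidths(n, q, a, b, c, d):
    """Minimize the AMISE over (h, g) > 0, optimizing on the log scale to keep both positive."""
    start = np.log([n ** (-1.0 / (5.0 + q))] * 2)
    objective = lambda t: amise_dirlin(np.exp(t[0]), np.exp(t[1]), n, q, a, b, c, d)
    res = minimize(objective, x0=start, method="Nelder-Mead",
                   options={"xatol": 1e-8, "fatol": 1e-14, "maxiter": 5000})
    return np.exp(res.x)

# Arbitrary illustrative constants for the circular-linear case (q = 1).
a, b, c, d, n = 2.0, 0.5, 0.3, 1.0, 1000
h_opt, g_opt = amise_bandwidths(n, 1, a, b, c, d)
beta = g_opt / h_opt
```

For $q = 1$, subtracting the two first-order conditions of this objective gives $a h^4 = b g^4$, so the numerically obtained ratio `beta` should match $(a/b)^{1/4}$, and `h_opt` should agree with the closed expression of Corollary 2 evaluated at that $\beta$.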



4.1 Some exact MISE calculations 

Closed expressions for the MISE for the directional and directional-linear estimators can be obtained
for some particular distribution models, and they will be derived in this section. In the linear setting,
Marron and Wand (1992) obtained a closed expression for the MISE of (1) if the kernel $K$ is a normal
density and the underlying model is a mixture of normal distributions. Specifically, the density of
an $r$-mixture of normal distributions with respective means $m_j$ and variances $\sigma_j^2$, for $j = 1, \ldots, r$,
is given by

$$f_r(z) = \sum_{j=1}^{r} p_j \phi_{\sigma_j}(z - m_j), \quad \sum_{j=1}^{r} p_j = 1, \quad p_j \geq 0,$$

where $p_j$, $j = 1, \ldots, r$, denote the mixture weights and $\phi_\sigma$ is the density of a normal with zero mean
and variance $\sigma^2$, i.e., $\phi_\sigma(z) = \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{z^2}{2\sigma^2}}$. Marron and Wand (1992) showed that the exact MISE
of the linear kernel estimator is

$$\mathrm{MISE}\left[\hat f_g(\cdot)\right] = \left(2\pi^{\frac{1}{2}} g n\right)^{-1} + p^T \left[(1 - n^{-1}) \Omega_2(g) - 2\Omega_1(g) + \Omega_0(g)\right] p, \qquad (10)$$

where $p = (p_1, \ldots, p_r)^T$ and $\Omega_a(g)$ are matrices with entries $\Omega_a(g) = \left(\phi_{\sigma_a}(m_i - m_j)\right)_{ij}$, $\sigma_a =
\left(a g^2 + \sigma_i^2 + \sigma_j^2\right)^{\frac{1}{2}}$, for $a = 0, 1, 2$.
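Formula (10) translates almost literally into code. The following sketch is an illustration with hypothetical function names; the mixture parameters in the usage lines are arbitrary example values.

```python
import numpy as np

def normal_pdf(z, sigma):
    """Density phi_sigma of a zero-mean normal with standard deviation sigma."""
    return np.exp(-0.5 * (z / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

def exact_mise_linear(g, n, weights, means, sigmas):
    """Exact MISE (10) of the estimator (1) with normal kernel, for a normal mixture target."""
    p = np.asarray(weights, dtype=float)
    m = np.asarray(means, dtype=float)
    s = np.asarray(sigmas, dtype=float)
    diff = m[:, None] - m[None, :]                 # m_i - m_j
    var_sum = s[:, None] ** 2 + s[None, :] ** 2    # sigma_i^2 + sigma_j^2

    def omega(a):
        # Matrix Omega_a(g) with entries phi_{sigma_a}(m_i - m_j).
        return normal_pdf(diff, np.sqrt(a * g**2 + var_sum))

    quad_form = p @ ((1.0 - 1.0 / n) * omega(2) - 2.0 * omega(1) + omega(0)) @ p
    return 1.0 / (2.0 * np.sqrt(np.pi) * g * n) + quad_form

# Arbitrary two-component example evaluated over a few bandwidths.
mise_values = [exact_mise_linear(g, 100, [0.5, 0.5], [0.0, 2.0], [1.0, 0.5])
               for g in (0.1, 0.3, 0.6)]
```

A simple sanity check is the large-bandwidth limit: as $g \to \infty$ only the $\Omega_0$ term survives, so the MISE approaches $\int f_r(z)^2\,dz$, which equals $1/(2\sqrt{\pi})$ for a single standard normal.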

Similar results can be obtained for the directional and directional-linear estimators, when
considering mixtures of von Mises for the directional case, and mixtures of von Mises and normals for
the directional-linear scenario (see Figure 2 for some examples). For the directional setting, an
$r$-mixture of von Mises with means $\mu_j$ and concentration parameters $\kappa_j$, for $j = 1, \ldots, r$, is given by

$$f_r(x) = \sum_{j=1}^{r} p_j f_{vM}(x; \mu_j, \kappa_j), \quad \sum_{j=1}^{r} p_j = 1, \quad p_j \geq 0. \qquad (11)$$

Consider a random sample $X_1, \ldots, X_n$ of a directional variable $X$ with density $f_r$ (see Figure 1,
right panel). The following result gives a closed expression for the MISE of the directional kernel
estimator.

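Proposition 6. Let $f_r$ be the density of an $r$-mixture of directional von Mises (11). The exact
MISE of the directional kernel estimator (2), obtained from a random sample of size $n$, with von
Mises kernel $L(r) = e^{-r}$ is

$$\mathrm{MISE}\left[\hat f_h(\cdot)\right] = \left(D_q(h) n\right)^{-1} + p^T \left[(1 - n^{-1}) \Psi_2(h) - 2\Psi_1(h) + \Psi_0(h)\right] p, \qquad (12)$$

where $p = (p_1, \ldots, p_r)^T$ and $D_q(h) = C_q\left(1/h^2\right)^{-2} C_q\left(2/h^2\right)$. The matrices $\Psi_a(h)$, $a = 0, 1, 2$, have
entries:

$$\Psi_0(h) = \left(\frac{C_q(\kappa_i) C_q(\kappa_j)}{C_q\left(\|\kappa_i \mu_i + \kappa_j \mu_j\|\right)}\right)_{ij},$$

$$\Psi_1(h) = \left(\int_{\Omega_q} \frac{C_q\left(1/h^2\right) C_q(\kappa_i) C_q(\kappa_j)}{C_q\left(\|x/h^2 + \kappa_i \mu_i\|\right)}\, e^{\kappa_j x^T \mu_j}\,\omega_q(dx)\right)_{ij},$$

$$\Psi_2(h) = \left(\int_{\Omega_q} \frac{C_q\left(1/h^2\right)^2 C_q(\kappa_i) C_q(\kappa_j)}{C_q\left(\|x/h^2 + \kappa_i \mu_i\|\right) C_q\left(\|x/h^2 + \kappa_j \mu_j\|\right)}\,\omega_q(dx)\right)_{ij},$$

where $C_q$ is defined in equation (6).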

The matrices involved in (12) are not as simple as the ones for the linear case, due to the convolution
properties of the von Mises density. For practical implementation of the exact MISE, it should be
noticed that matrices $\Psi_2(h)$ and $\Psi_1(h)$ can be evaluated using numerical integration in $q$-spherical
coordinates. For clarity purposes, constants $C_q(\kappa_i)$ are included inside matrices $\Psi_2(h)$, $\Psi_1(h)$ and
$\Psi_0(h)$ but it is computationally more efficient to consider them within the weights, that is, take

$p = \left(p_1 C_q(\kappa_1), \ldots, p_r C_q(\kappa_r)\right)$.
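For the circular case ($q = 1$), the numerical integration can be done on a uniform grid of angles. The sketch below is an illustration of (12) under that assumption, with $C_1(\kappa) = 1/(2\pi I_0(\kappa))$ handled on the log scale through SciPy's scaled Bessel function `ive`; the function names, the grid size and the mixture used in the example are hypothetical.

```python
import numpy as np
from scipy.special import ive

def log_c1(kappa):
    """log C_1(kappa) = -log(2 pi I_0(kappa)), with I_0(k) = ive(0, k) * exp(k)."""
    kappa = np.asarray(kappa, dtype=float)
    return -np.log(2.0 * np.pi) - np.log(ive(0, kappa)) - kappa

def exact_mise_circular(h, n, weights, mean_angles, kappas, n_grid=2048):
    """Exact MISE (12) for q = 1; Psi_1 and Psi_2 are integrated on a uniform angular grid."""
    p = np.asarray(weights, dtype=float)
    ang = np.asarray(mean_angles, dtype=float)
    kap = np.asarray(kappas, dtype=float)
    mu = np.column_stack([np.cos(ang), np.sin(ang)])                # (r, 2) mean directions
    theta = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    x = np.column_stack([np.cos(theta), np.sin(theta)])             # (N, 2) grid on the circle
    w = 2.0 * np.pi / n_grid                                        # quadrature weight
    kh = 1.0 / h**2
    lc = log_c1(kap)
    # ||x / h^2 + kappa_i mu_i|| for every grid point and component, shape (N, r)
    norms = np.linalg.norm(kh * x[:, None, :] + kap[None, :, None] * mu[None, :, :], axis=2)
    smooth = np.exp(log_c1(kh) + lc[None, :] - log_c1(norms))       # kernel-smoothed components
    comp = np.exp(lc[None, :] + kap[None, :] * (x @ mu.T))          # mixture components
    psi2 = w * smooth.T @ smooth
    psi1 = w * smooth.T @ comp
    norm0 = np.linalg.norm(kap[:, None, None] * mu[:, None, :]
                           + kap[None, :, None] * mu[None, :, :], axis=2)
    psi0 = np.exp(lc[:, None] + lc[None, :] - log_c1(norm0))
    log_d = log_c1(2.0 * kh) - 2.0 * log_c1(kh)                     # log D_1(h)
    quad_form = p @ ((1.0 - 1.0 / n) * psi2 - 2.0 * psi1 + psi0) @ p
    return np.exp(-log_d) / n + quad_form

# Arbitrary two-component circular mixture evaluated over a few bandwidths.
weights, mean_angles, kappas = [0.5, 0.5], [0.0, np.pi / 2.0], [2.0, 10.0]
mise_curve = [exact_mise_circular(h, 100, weights, mean_angles, kappas)
              for h in (0.1, 0.2, 0.4, 0.8)]
```

A useful check of such an implementation is the limit $h \to \infty$: the von Mises kernel then tends to the uniform density on the circle, and (12) reduces to $\int f_r^2 - 1/(2\pi)$.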

From Proposition 6, it is easy to derive an analogous result for the case of an $r$-mixture of
directional-linear independent von Mises and normals:

$$f_r(x, z) = \sum_{j=1}^{r} p_j f_{vM}(x; \mu_j, \kappa_j)\, \phi_{\sigma_j}(z - m_j), \quad \sum_{j=1}^{r} p_j = 1, \quad p_j \geq 0. \qquad (13)$$
Figure 2: From left to right: circular-linear mixture (15) and corresponding circular and linear marginal
densities, respectively. Random samples of size $n = 200$ are drawn.

Proposition 7. Let $f_r$ be the density of an $r$-mixture of directional-linear independent von Mises
and normal densities given in (13). For a random sample of size $n$, the exact MISE of the
directional-linear kernel density estimator (9) with von Mises-normal kernel $LK(r, t) = e^{-r} \times
\phi_1(t)$ is

$$\mathrm{MISE}\left[\hat f_{h,g}(\cdot, \cdot)\right] = \left(D_q(h)\, 2\pi^{\frac{1}{2}} g n\right)^{-1}
+ p^T \left[(1 - n^{-1}) \Psi_2(h) \circ \Omega_2(g) - 2\Psi_1(h) \circ \Omega_1(g) + \Psi_0(h) \circ \Omega_0(g)\right] p,$$

where $\circ$ denotes the Hadamard product between matrices and the involved terms are defined as in
Proposition 6 and equation (10).

Once the exact MISE and the AMISE for mixtures of von Mises and normals are derived, it is
possible to compare these two error criteria. To that end, consider the following directional
mixture

'^vM ((1, 0,)), 2) + '^vM {{%, 1), 10) + ^vM ((-1, Og), 2) , (14) 
where $0_q$ represents a vector of $q$ zeros, and the directional-linear mixture

Im fo, -\ X vM ((1, 0,)), 2) + In (1, 1) X vM ((0„ 1), 10) + \n (2, 1) x vM ((-1, 0,), 2) . (15) 



Figure 3 shows the comparison between the exact and asymptotic MISE for the linear, circular and
spherical case. As first noted by Marron and Wand (1992) for the linear estimator, there exist
significant differences between these two errors, the most remarkable one being the rapid growth
of the AMISE with respect to the MISE for larger values of the bandwidth. This effect is due to the
fact that, for a general bandwidth $h$, $\lim_{h \to \infty} \mathrm{AMISE}(h) = \infty$ since $\mathrm{AMISE}(h)$ is proportional to
$h^4$, whereas the MISE levels off at $\lim_{h \to \infty} \mathrm{MISE}(h) = \int f(u)^2\,du$. Besides, for the directional case,
this effect seems to be augmented, probably because of a scale effect in the bandwidths, in the sense
that the support of the directional variables is bounded, which is not the case for the linear ones
considered. However, although the AMISE and MISE curves differ significantly, the corresponding
optimal bandwidths get closer for increasing sample sizes.

Figure 4 contains the contour plots of the exact and asymptotic MISE for the circular-linear and
spherical-linear cases. The conclusions are more or less the same as for Figure 3: the asymptotic
MISE grows more quickly than the exact MISE for large values of $h$ or $g$. On the other hand, the
contour lines of both surfaces are quite close for small values of the bandwidths and the optimal
bandwidths also get closer for larger sample sizes.




Figure 3: From left to right: exact MISE and AMISE for the linear mixture |7V (O, |) + |7V (1,1) + |7V (2, 1) 
and the circular and spherical mixtures (14), for a range of bandwidths between 0 and 1. The black curves
are for the MISE, whereas the red ones are for the AMISE. Solid curves correspond to $n = 100$ and dotted
to $n = 1000$. Vertical lines represent the bandwidth values minimizing each curve.



As an immediate application of Propositions 6 and 7, a bootstrap version of the MISE for the
directional and directional-linear estimators can be derived. The bootstrap MISE is an estimator
of the true MISE obtained by considering a smooth bootstrap resampling scheme, which will be
briefly detailed. In the linear case, the bootstrap MISE is given by

$$\mathrm{MISE}^*_{g_p}\left[\hat f_g(\cdot)\right] = E^*\left[\int_{\mathbb{R}} \left(\hat f^*_g(z) - \hat f_{g_p}(z)\right)^2 dz\right],$$

where $\hat f^*_g(z) = \frac{1}{ng} \sum_{i=1}^{n} K\left(\frac{z - Z_i^*}{g}\right)$, the sample $Z_1^*, \ldots, Z_n^*$ being distributed as $\hat f_{g_p}$. In this case,
$g_p$ is a pilot bandwidth and the expectation $E^*$ is taken with respect to the density estimator $\hat f_{g_p}$.

For the linear case, Cao (1993) derived an exact closed expression for $\mathrm{MISE}^*_{g_p}\left[\hat f_g(\cdot)\right]$ that actually
avoids the need for resampling, and derived a bandwidth that minimizes the bootstrap MISE by
previously computing a suitable pilot bandwidth $g_p$.



The following two results show the bootstrap MISE expressions for the estimators (2) and (9) in the
case where the kernels are von Mises and normals. As in the linear case, no resampling is needed for
computing the bootstrap MISE. These bootstrap versions of the error provide an overall summary
of the estimator behaviour, with no restriction on the underlying densities, as long as von Mises
and normal kernels are considered. In addition, the following results could be used to derive a
bandwidth selector, but it will depend on the selection of pilot bandwidths for both components,
which is not an easy problem.

Corollary 3. The bootstrap MISE for directional data, given a sample of length $n$, the von Mises
kernel $L(r) = e^{-r}$ and a pilot bandwidth $h_p$, is:

$$\mathrm{MISE}^*_{h_p}\left[\hat f_h(\cdot)\right] = \left(D_q(h) n\right)^{-1} + n^{-2}\, 1^T \left[(1 - n^{-1}) \Psi^*_2(h) - 2\Psi^*_1(h) + \Psi^*_0(h)\right] 1,$$

where the matrices $\Psi^*_a(h)$, $a = 0, 1, 2$, have the same entries as $\Psi_a(h)$ but with $\kappa_i = 1/h_p^2$ and
$\mu_i = X_i$ for $i = 1, \ldots, n$.

Remark 3. In the particular case where $q = 1$ and $h_p = h$, Corollary 3 corresponds to the expression
of the bootstrap MISE given in Di Marzio et al. (2011).
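A sketch of Corollary 3 for circular data ($q = 1$) is given below. It is an illustration under the same assumptions as a grid-based evaluation of (12): the integrals in $\Psi^*_1(h)$ and $\Psi^*_2(h)$ are computed on a uniform grid of angles, every component has concentration $1/h_p^2$ and mean direction given by an observation, and the function names, sample and bandwidth values are hypothetical.

```python
import numpy as np
from scipy.special import ive

def log_c1(kappa):
    """log C_1(kappa) = -log(2 pi I_0(kappa)), with I_0(k) = ive(0, k) * exp(k)."""
    kappa = np.asarray(kappa, dtype=float)
    return -np.log(2.0 * np.pi) - np.log(ive(0, kappa)) - kappa

def bootstrap_mise_circular(h, h_pilot, sample_angles, n_grid=2048):
    """Bootstrap MISE of Corollary 3 for q = 1, with no resampling involved."""
    ang = np.asarray(sample_angles, dtype=float)
    n = ang.size
    kh, kp = 1.0 / h**2, 1.0 / h_pilot**2
    theta = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    w = 2.0 * np.pi / n_grid                                   # quadrature weight
    cos_xi = np.cos(theta[:, None] - ang[None, :])             # x^T X_i, shape (N, n)
    # ||x / h^2 + X_i / h_p^2|| for unit vectors x and X_i
    norms = np.sqrt(np.maximum(kh**2 + kp**2 + 2.0 * kh * kp * cos_xi, 0.0))
    smooth = np.exp(log_c1(kh) + log_c1(kp) - log_c1(norms))   # kernel-smoothed pilot components
    comp = np.exp(log_c1(kp) + kp * cos_xi)                    # pilot components vM(X_i, 1/h_p^2)
    psi2 = w * smooth.T @ smooth
    psi1 = w * smooth.T @ comp
    cos_ij = np.cos(ang[:, None] - ang[None, :])
    norm0 = kp * np.sqrt(np.maximum(2.0 + 2.0 * cos_ij, 0.0))  # ||X_i + X_j|| / h_p^2
    psi0 = np.exp(2.0 * log_c1(kp) - log_c1(norm0))
    inner = (1.0 - 1.0 / n) * psi2 - 2.0 * psi1 + psi0
    log_d = log_c1(2.0 * kh) - 2.0 * log_c1(kh)                # log D_1(h)
    return np.exp(-log_d) / n + inner.sum() / n**2

# Illustrative sample of 40 angles and arbitrary bandwidths.
rng = np.random.default_rng(3)
angles = rng.vonmises(0.0, 2.0, size=40)
boot_value = bootstrap_mise_circular(0.4, 0.5, angles)
```

Minimizing such a function over $h$ for a fixed pilot $h_p$ would give a bootstrap bandwidth, although, as noted above, the choice of the pilot remains an open issue.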
Figure 4: Upper panel, from left to right: exact MISE versus AMISE for the circular-linear mixture (15)
for $n = 100$ and $n = 1000$. Lower panel, from left to right: spherical-linear mixture (15) for $n = 100$ and
$n = 1000$. The black curves are for the MISE, whereas the red ones are for the AMISE. The pairs of bandwidths
that minimize each error surface are denoted by $(h, g)_{\mathrm{MISE}}$ and by $(h, g)_{\mathrm{AMISE}}$.



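Corollary 4. The bootstrap MISE for directional-linear data, given a sample of length $n$, the von Mises-normal kernel $LK(r, t) = e^{-r} \times \phi_1(t)$ and a pair of pilot bandwidths $(h_p, g_p)$, is:

$$\mathrm{MISE}^*_{h_p, g_p}\left[\hat f_{h,g}\right] = \left(D_q(h)\, 2\pi^{1/2} g n\right)^{-1} + n^{-2}\, \mathbf{1}^T \left[(1 - n^{-1})\Psi_2^*(h) \circ \Omega_2^*(g) - 2\Psi_1^*(h) \circ \Omega_1^*(g) + \Psi_0^*(h) \circ \Omega_0^*(g)\right] \mathbf{1},$$

where the matrices $\Psi_a^*(h)$ and $\Omega_a^*(g)$, $a = 0, 1, 2$, have the same entries as $\Psi_a(h)$ and $\Omega_a(g)$ but with $\kappa_i = 1/h_p^2$, $\boldsymbol\mu_i = \mathbf{X}_i$, $m_i = Z_i$ and $\sigma_i = g_p$ for $i = 1, \ldots, n$.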



5 Conclusions 

A kernel density estimator for directional-linear data is proposed. Bias, variance and asymptotic normality of the estimator are derived, as well as expressions for the MISE and AMISE. For the particular case of mixtures of von Mises, for directional data, and mixtures of von Mises and normals, in the directional-linear case, exact expressions for the MISE are obtained, which enables the comparison with their asymptotic versions.

Undoubtedly, one of the main issues in kernel estimation is the appropriate selection of the bandwidth parameter. Although an optimal pair of bandwidths in the AMISE sense has been derived, further research is needed to obtain a bandwidth selection method that could be applied in practice. This problem also affects the purely directional setting, where (likelihood and least squares) cross-validation methods seem to be the only available procedures. However, the exact MISE computations open a route to develop bandwidth selectors, for instance, following the ideas in Oliveira et al. (2012). In fact, a bootstrap version of the MISE, obtained by assuming that the underlying model is a mixture, allows for the derivation of bootstrap bandwidths, as in Cao (1993) for the linear case.

A straightforward extension of the proposed estimator can be found in the directional-multidimensional setting, considering a multidimensional linear random variable. In this case, the linear part of the estimator should be properly adapted by including a multidimensional kernel and possibly a bandwidth matrix.
A Some technical lemmas



Some technical lemmas that will be used along the proofs of the main results are introduced in this section. To begin with, Lemma 1 establishes the asymptotic behaviour of $\lambda_{h,q}(L)$ in (3). With the aim of clarifying the computation of the integrals in the proofs of the main results, Lemma 2 details a change of variables in $\Omega_q$, whereas Lemma 3 is used to simplify integrals in $\Omega_q$. Lemma 4 gives some of the constants introduced along the work for the case where the kernel is von Mises and, finally, Lemma 5 restates Lemma 2 of Zhao and Wu (2001).

Detailed proofs of these lemmas can be found in Appendix C. That appendix also includes a reworked proof of Lemma 5, using the same techniques as for the other results, which presents some differences from the original proof.



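Lemma 1. The limit of $\lambda_{h,q}(L) = \omega_{q-1} \int_0^{2h^{-2}} L(r)\, r^{\frac{q}{2}-1} (2 - r h^2)^{\frac{q}{2}-1}\, dr$, when $h \to 0$, is

$$\lim_{h \to 0} \lambda_{h,q}(L) = \lambda_q(L) = 2^{\frac{q}{2}-1} \omega_{q-1} \int_0^{\infty} L(r)\, r^{\frac{q}{2}-1}\, dr, \tag{16}$$

where $\omega_q$ is the surface area of $\Omega_q$, for $q \geq 1$.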
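
The convergence in Lemma 1 can be illustrated numerically. The following is a minimal sketch, assuming the von Mises kernel $L(r) = e^{-r}$ and $q \geq 2$; the function names, the midpoint rule and the truncation point `s_max` are illustrative choices, not part of the original work.

```python
import math

def sphere_area(q):
    """Surface area omega_q of Omega_q, the unit sphere in R^(q+1)."""
    return 2.0 * math.pi ** ((q + 1) / 2.0) / math.gamma((q + 1) / 2.0)

def lambda_hq(h, q, kernel=lambda r: math.exp(-r), steps=20000, s_max=8.0):
    """Approximate lambda_{h,q}(L) for q >= 2 with a midpoint rule.

    The substitution r = s^2 is used and the range is truncated at s_max,
    which is harmless for kernels with fast decay such as exp(-r).
    """
    upper = min(math.sqrt(2.0) / h, s_max)
    ds = upper / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * ds
        r = s * s
        total += 2.0 * kernel(r) * s ** (q - 1) * (2.0 - r * h * h) ** (q / 2.0 - 1.0)
    return sphere_area(q - 1) * total * ds

def lambda_q_von_mises(q):
    """Limit value in (16) for L(r) = exp(-r), namely (2*pi)^(q/2)."""
    return (2.0 * math.pi) ** (q / 2.0)

# The gap to the limit should shrink as h decreases.
for h in (0.5, 0.2, 0.1):
    print(h, lambda_hq(h, 3), lambda_q_von_mises(3))
```

For $q = 2$ the integral has the closed form $2\pi(1 - e^{-2/h^2})$, which gives a simple way of checking the quadrature.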

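Lemma 2 (A change of variables in $\Omega_q$). Let $f$ be a function defined in $\Omega_q$ and $\mathbf{y} \in \Omega_q$ a fixed point. The integral $\int_{\Omega_q} f(\mathbf{x})\, \omega_q(d\mathbf{x})$ can be expressed in one of the following equivalent forms:

$$\int_{\Omega_q} f(\mathbf{x})\, \omega_q(d\mathbf{x}) = \int_{-1}^{1} \int_{\Omega_{q-1}} f\left(t, (1 - t^2)^{\frac{1}{2}} \boldsymbol\xi\right) (1 - t^2)^{\frac{q}{2}-1}\, \omega_{q-1}(d\boldsymbol\xi)\, dt \tag{17}$$

$$= \int_{-1}^{1} \int_{\Omega_{q-1}} f\left(t \mathbf{y} + (1 - t^2)^{\frac{1}{2}} \mathbf{B}_q \boldsymbol\xi\right) (1 - t^2)^{\frac{q}{2}-1}\, \omega_{q-1}(d\boldsymbol\xi)\, dt, \tag{18}$$

where $\mathbf{B}_q = (\mathbf{b}_1, \ldots, \mathbf{b}_q)_{(q+1) \times q}$ is the semi-orthonormal matrix ($\mathbf{B}_q^T \mathbf{B}_q = \mathbf{I}_q$ and $\mathbf{B}_q \mathbf{B}_q^T = \mathbf{I}_{q+1} - \mathbf{y}\mathbf{y}^T$) resulting from the completion of $\mathbf{y}$ to the orthonormal basis $\{\mathbf{y}, \mathbf{b}_1, \ldots, \mathbf{b}_q\}$.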
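
As an illustration of (17), the sketch below evaluates its right-hand side for $q = 2$, where the weight $(1 - t^2)^{q/2 - 1}$ equals one and $\Omega_1$ is parametrized by an angle. The quadrature rule and the test functions are assumptions made for the example only.

```python
import math

def integrate_sphere_q2(f, n_t=400, n_theta=400):
    """Right-hand side of (17) for q = 2, using midpoint rules in t and theta.

    A point of Omega_2 is written as (t, sqrt(1 - t^2) * xi) with xi on the circle.
    """
    dt = 2.0 / n_t
    dth = 2.0 * math.pi / n_theta
    total = 0.0
    for a in range(n_t):
        t = -1.0 + (a + 0.5) * dt
        s = math.sqrt(1.0 - t * t)
        for b in range(n_theta):
            th = (b + 0.5) * dth
            total += f((t, s * math.cos(th), s * math.sin(th)))
    return total * dt * dth

kappa = 2.0
# Surface area of the sphere: 4*pi.
print(integrate_sphere_q2(lambda x: 1.0), 4.0 * math.pi)
# Inverse of the von Mises-Fisher normalizing constant: 4*pi*sinh(kappa)/kappa.
print(integrate_sphere_q2(lambda x: math.exp(kappa * x[0])),
      4.0 * math.pi * math.sinh(kappa) / kappa)
```

The second check recovers the normalizing constant of a von Mises density on $\Omega_2$ with mean direction $(1, 0, 0)$, which is the kind of integral that appears repeatedly in the exact MISE computations.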

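Lemma 3. Consider $\mathbf{x} \in \Omega_q$, a point in the $q$-dimensional sphere with entries $(x_1, \ldots, x_{q+1})$. For all $i, j = 1, \ldots, q + 1$, it holds that

$$\int_{\Omega_q} x_i\, \omega_q(d\mathbf{x}) = 0, \qquad \int_{\Omega_q} x_i x_j\, \omega_q(d\mathbf{x}) = \begin{cases} 0, & i \neq j, \\ \frac{\omega_q}{q+1}, & i = j, \end{cases}$$

where $\omega_q$ is the surface area of $\Omega_q$, for $q \geq 1$.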

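Lemma 4. For the von Mises kernel, i.e., $L(r) = e^{-r}$, $r \geq 0$,

$$c_{h,q}(L) = e^{1/h^2} \left[h^{q-1} (2\pi)^{\frac{q+1}{2}} \mathcal{I}_{\frac{q-1}{2}}(1/h^2)\right]^{-1}, \quad \lambda_q(L) = (2\pi)^{\frac{q}{2}}, \quad b_q(L) = \frac{q}{2}, \quad d_q(L) = 2^{-\frac{q}{2}}.$$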

Lemma 5 (Lemma 2 in Zhao and Wu (2001)). Under the conditions D1-D3, the expected value of the directional kernel density estimator in a point $\mathbf{x} \in \Omega_q$ is

$$\mathbb{E}\left[\hat f_h(\mathbf{x})\right] = f(\mathbf{x}) + b_q(L) \Psi(f, \mathbf{x}) h^2 + o\left(h^2\right),$$

where $\Psi(f, \mathbf{x})$ and $b_q(L)$ are given in (4) and (5), respectively.
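
The expansion in Lemma 5 can be compared with an exact computation on the circle ($q = 1$). The sketch below assumes a von Mises density with concentration $\kappa$, the von Mises kernel, and evaluation at the mean direction, where the second derivative of the density in the angle gives $\Psi(f, \boldsymbol\mu) = -\kappa f(\boldsymbol\mu)$ and $b_1(L) = 1/2$. The series implementation of the Bessel function and all names are illustrative assumptions.

```python
import math

def bessel_i0(x, max_terms=5000):
    """Modified Bessel function I_0(x) via its power series (adequate for x up to a few hundred)."""
    total, term = 1.0, 1.0
    for k in range(1, max_terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
        if term < 1e-17 * total:
            break
    return total

def density_at_mean(kappa):
    """Value of the vM(mu, kappa) density on the circle at x = mu."""
    return math.exp(kappa) / (2.0 * math.pi * bessel_i0(kappa))

def expected_kde_at_mean(h, kappa):
    """Exact E[f_hat_h(mu)] for the von Mises kernel.

    Uses the convolution of two von Mises densities,
    C(a) C(kappa) / C(a + kappa) with a = 1/h^2 and C(k) = 1 / (2 pi I_0(k)).
    """
    a = 1.0 / h ** 2
    return bessel_i0(a + kappa) / (2.0 * math.pi * bessel_i0(a) * bessel_i0(kappa))

kappa = 2.0
for h in (0.2, 0.1, 0.05):
    exact_bias = expected_kde_at_mean(h, kappa) - density_at_mean(kappa)
    leading_term = 0.5 * (-kappa * density_at_mean(kappa)) * h ** 2
    print(h, exact_bias, leading_term)
```

The ratio between the exact bias and the leading term approaches one as $h$ decreases, as the $o(h^2)$ remainder suggests.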
B Proofs of the main results



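Proof of Proposition 1. The variance can be decomposed in two terms as follows:

$$\mathrm{Var}\left[\hat f_h(\mathbf{x})\right] = \frac{c_{h,q}(L)^2}{n} \mathbb{E}\left[L^2\left(\frac{1 - \mathbf{x}^T\mathbf{X}}{h^2}\right)\right] - \frac{1}{n} \mathbb{E}\left[\hat f_h(\mathbf{x})\right]^2, \tag{19}$$

where the computation of the first term is quite similar to the computation of the bias given in Lemma 5 and the second is given by the same result.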



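Therefore, analogously to equation (47) of Lemma 5, the first addend can be expressed as

$$\frac{c_{h,q}(L)^2 h^q}{n} \int_0^{2h^{-2}} L^2(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \int_{\Omega_{q-1}} f\left(\mathbf{x} + \boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi}\right) \omega_{q-1}(d\boldsymbol\xi)\, dr, \tag{20}$$

just replacing the kernel $L$ by the squared kernel $L^2$ and where $\boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi} = -r h^2 \mathbf{x} + h \left[r(2 - h^2 r)\right]^{\frac{1}{2}} \mathbf{B}_q \boldsymbol\xi$, so that $\mathbf{x} + \boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi} \in \Omega_q$. By condition D1, the Taylor expansion of $f$ at $\mathbf{x}$ is

$$f\left(\mathbf{x} + \boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi}\right) - f(\mathbf{x}) = \boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi}^T \nabla f(\mathbf{x}) + \frac{1}{2} \boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi}^T \mathcal{H}f(\mathbf{x}) \boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi} + o\left(\boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi}^T \boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi}\right).$$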

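Hence,

$$(20) = \frac{\omega_{q-1} c_{h,q}(L)^2 h^q}{n} \int_0^{2h^{-2}} L^2(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \Big\{ f(\mathbf{x}) - r h^2 \mathbf{x}^T \nabla f(\mathbf{x}) + \frac{r^2 h^4}{2} \mathbf{x}^T \mathcal{H}f(\mathbf{x}) \mathbf{x} + \frac{h^2 r (2 - h^2 r)}{2q} \left(\nabla^2 f(\mathbf{x}) - \mathbf{x}^T \mathcal{H}f(\mathbf{x}) \mathbf{x}\right) + r\, o\left(h^2\right) \Big\}\, dr$$

$$= \frac{c_{h,q}(L)}{n} \Big\{ \omega_{q-1} \Big[\int_0^{2h^{-2}} c_{h,q}(L) h^q L^2(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1}\, dr\Big] f(\mathbf{x})$$

$$- h^2 \omega_{q-1} \Big[\int_0^{2h^{-2}} c_{h,q}(L) h^q L^2(r)\, r^{\frac{q}{2}} (2 - h^2 r)^{\frac{q}{2}-1}\, dr\Big] \mathbf{x}^T \nabla f(\mathbf{x})$$

$$+ \frac{h^4 \omega_{q-1}}{2} \Big[\int_0^{2h^{-2}} c_{h,q}(L) h^q L^2(r)\, r^{\frac{q}{2}+1} (2 - h^2 r)^{\frac{q}{2}-1}\, dr\Big] \mathbf{x}^T \mathcal{H}f(\mathbf{x}) \mathbf{x}$$

$$+ \frac{h^2 \omega_{q-1}}{2q} \Big[\int_0^{2h^{-2}} c_{h,q}(L) h^q L^2(r)\, r^{\frac{q}{2}} (2 - h^2 r)^{\frac{q}{2}}\, dr\Big] \left(\nabla^2 f(\mathbf{x}) - \mathbf{x}^T \mathcal{H}f(\mathbf{x}) \mathbf{x}\right)$$

$$+ \omega_{q-1} \Big[\int_0^{2h^{-2}} c_{h,q}(L) h^q L^2(r)\, r^{\frac{q}{2}} (2 - h^2 r)^{\frac{q}{2}-1}\, dr\Big] o\left(h^2\right) \Big\}. \tag{21}$$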



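The integrals in (21) can be simplified. For that purpose, define for $h > 0$ and indices $i = -1, 0, 1$, $j = 0, 1$ the following function:

$$\varphi_{h,i,j}(r) = c_{h,q}(L) h^q L^2(r)\, r^{\frac{q}{2}+i} (2 - h^2 r)^{\frac{q}{2}-j}\, \mathbb{1}_{[0, 2h^{-2})}(r), \quad r \in [0, \infty).$$

As $n \to \infty$, the bandwidth $h \to 0$ and the limit of $\varphi_{h,i,j}$ is given by

$$\varphi_{i,j}(r) = \lim_{h \to 0} \varphi_{h,i,j}(r) = \lambda_q(L)^{-1} L^2(r)\, r^{\frac{q}{2}+i}\, 2^{\frac{q}{2}-j}\, \mathbb{1}_{[0, \infty)}(r).$$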
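Applying the Dominated Convergence Theorem (DCT) and the same techniques of the proof of Lemma 1 (see Remark 5), it can be seen that:

$$\lim_{h \to 0} \int_0^{\infty} \varphi_{h,i,j}(r)\, dr = \lambda_q(L)^{-1} 2^{\frac{q}{2}-j} \int_0^{\infty} L^2(r)\, r^{\frac{q}{2}+i}\, dr \overset{(16)}{=} \begin{cases} \frac{2^{1-j}}{\omega_{q-1}} d_q(L), & i = -1, \\ \frac{2^{1-j}}{\omega_{q-1}} e_q(L), & i = 0, \\ \frac{2^{1-j}}{\omega_{q-1}} \frac{\int_0^{\infty} L^2(r)\, r^{\frac{q}{2}+1}\, dr}{\int_0^{\infty} L(r)\, r^{\frac{q}{2}-1}\, dr}, & i = 1, \end{cases}$$

where $e_q(L) = \int_0^{\infty} L^2(r)\, r^{\frac{q}{2}}\, dr \big/ \int_0^{\infty} L(r)\, r^{\frac{q}{2}-1}\, dr$. Then, taking into account that $\int_0^{\infty} \varphi_{h,i,j}(r)\, dr = \int_0^{\infty} \varphi_{i,j}(r)\, dr \cdot (1 + o(1))$, the integrals in brackets of (21) can be replaced, obtaining that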



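$$(21) = \frac{c_{h,q}(L)}{n} \left[d_q(L) f(\mathbf{x}) + e_q(L) h^2 \Psi(f, \mathbf{x})\right] + o\left((n h^q)^{-1}\right). \tag{22}$$

The second term in (19) is given by:

$$\mathbb{E}\left[\hat f_h(\mathbf{x})\right]^2 = \left[f(\mathbf{x}) + b_q(L) h^2 \Psi(f, \mathbf{x}) + o\left(h^2\right)\right]^2 = \left[f(\mathbf{x}) + b_q(L) h^2 \Psi(f, \mathbf{x})\right]^2 + o\left(h^2\right). \tag{23}$$

The result holds from (22) and (23):

$$\mathrm{Var}\left[\hat f_h(\mathbf{x})\right] = \frac{c_{h,q}(L)}{n} \left[d_q(L) f(\mathbf{x}) + e_q(L) h^2 \Psi(f, \mathbf{x})\right] - \frac{1}{n} \left[f(\mathbf{x}) + b_q(L) h^2 \Psi(f, \mathbf{x})\right]^2 + o\left((n h^q)^{-1}\right),$$

which can be simplified into

$$\mathrm{Var}\left[\hat f_h(\mathbf{x})\right] = \frac{c_{h,q}(L)}{n} d_q(L) f(\mathbf{x}) + o\left((n h^q)^{-1}\right).$$

□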



Proof of Proposition 2. Denote the bias of the kernel estimator by $\mathrm{Bias}\left[\hat f_{h,g}(\mathbf{x}, z)\right] = \mathbb{E}\left[\hat f_{h,g}(\mathbf{x}, z)\right] - f(\mathbf{x}, z)$. Applying the change of variables stated in Lemma 2 and then an ordinary variable change given by $r = \frac{1 - t}{h^2}$, the bias results in:



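$$\mathrm{Bias}\left[\hat f_{h,g}(\mathbf{x}, z)\right] = \frac{c_{h,q}(L)}{g} \mathbb{E}\left[LK\left(\frac{1 - \mathbf{x}^T\mathbf{X}}{h^2}, \frac{z - Z}{g}\right)\right] - f(\mathbf{x}, z)$$

$$= \frac{c_{h,q}(L)}{g} \int_{\Omega_q} \int_{\mathbb{R}} LK\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}, \frac{z - t}{g}\right) \left(f(\mathbf{y}, t) - f(\mathbf{x}, z)\right) dt\, \omega_q(d\mathbf{y})$$

$$= c_{h,q}(L) \int_{\Omega_q} \int_{\mathbb{R}} LK\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}, v\right) \left(f(\mathbf{y}, z - gv) - f(\mathbf{x}, z)\right) dv\, \omega_q(d\mathbf{y})$$

$$= c_{h,q}(L) \int_{-1}^{1} \int_{\Omega_{q-1}} \int_{\mathbb{R}} LK\left(\frac{1 - u}{h^2}, v\right) \left(f\left(u\mathbf{x} + (1 - u^2)^{\frac{1}{2}} \mathbf{B}_q \boldsymbol\xi, z - gv\right) - f(\mathbf{x}, z)\right) (1 - u^2)^{\frac{q}{2}-1}\, dv\, \omega_{q-1}(d\boldsymbol\xi)\, du$$

$$= c_{h,q}(L) h^q \int_0^{2h^{-2}} \int_{\Omega_{q-1}} \int_{\mathbb{R}} LK(r, v) \left(f\left((\mathbf{x}, z) + \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right) - f(\mathbf{x}, z)\right) dv\, \omega_{q-1}(d\boldsymbol\xi)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1}\, dr$$

$$= c_{h,q}(L) h^q \int_0^{2h^{-2}} L(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \int_{\mathbb{R}} K(v) \int_{\Omega_{q-1}} \left(f\left((\mathbf{x}, z) + \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right) - f(\mathbf{x}, z)\right) \omega_{q-1}(d\boldsymbol\xi)\, dv\, dr, \tag{24}$$

where $\boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi} = \left(-r h^2 \mathbf{x} + h \left[r(2 - h^2 r)\right]^{\frac{1}{2}} \mathbf{B}_q \boldsymbol\xi, -gv\right)$, so that $(\mathbf{x}, z) + \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi} \in \Omega_q \times \mathbb{R}$. The computation of the last integral in (24) is achieved using the multivariate Taylor expansion of $f$ at $(\mathbf{x}, z)$, in virtue of condition DL1:

$$f\left((\mathbf{x}, z) + \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right) - f(\mathbf{x}, z) = \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}^T \nabla f(\mathbf{x}, z) + \frac{1}{2} \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}^T \mathcal{H}f(\mathbf{x}, z) \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi} + o\left(\boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}^T \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right).$$

Denote $\boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi} = -r h^2 \mathbf{x} + h \left[r(2 - h^2 r)\right]^{\frac{1}{2}} \mathbf{B}_q \boldsymbol\xi$. Bearing in mind the directional and linear components of the gradient $\nabla f(\mathbf{x}, z)$ and the Hessian matrix $\mathcal{H}f(\mathbf{x}, z)$, it follows that

$$f\left((\mathbf{x}, z) + \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right) - f(\mathbf{x}, z) = \left[\boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z) - gv \nabla_z f(\mathbf{x}, z)\right] + \frac{1}{2} \left[\boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi} - 2gv \boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \mathcal{H}_{\mathbf{x},z} f(\mathbf{x}, z) + g^2 v^2 \mathcal{H}_z f(\mathbf{x}, z)\right] + o\left(\boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}^T \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right).$$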

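Then, the computation of the integral $\int_{\Omega_{q-1}} \left(f\left((\mathbf{x}, z) + \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right) - f(\mathbf{x}, z)\right) \omega_{q-1}(d\boldsymbol\xi)$ can be split into six addends (the five terms of the expansion and the remainder). The terms in $\nabla_z f$ and $\mathcal{H}_z f$ do not depend on $\boldsymbol\xi$ and are computed straightforwardly:

$$\int_{\Omega_{q-1}} -gv \nabla_z f(\mathbf{x}, z)\, \omega_{q-1}(d\boldsymbol\xi) = -\omega_{q-1} gv \nabla_z f(\mathbf{x}, z), \tag{25}$$

$$\int_{\Omega_{q-1}} g^2 v^2 \mathcal{H}_z f(\mathbf{x}, z)\, \omega_{q-1}(d\boldsymbol\xi) = \omega_{q-1} g^2 v^2 \mathcal{H}_z f(\mathbf{x}, z). \tag{26}$$

For the terms in $\nabla_{\mathbf{x}} f$ and $\mathcal{H}_{\mathbf{x},z} f$, by Lemma 3, the integration of $\xi_i$ with respect to $\boldsymbol\xi$ is zero:

$$\int_{\Omega_{q-1}} \boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z)\, \omega_{q-1}(d\boldsymbol\xi) = -\omega_{q-1} h^2 r\, \mathbf{x}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z), \tag{27}$$

$$\int_{\Omega_{q-1}} -2gv \boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \mathcal{H}_{\mathbf{x},z} f(\mathbf{x}, z)\, \omega_{q-1}(d\boldsymbol\xi) = 2gv\, \omega_{q-1} h^2 r\, \mathbf{x}^T \mathcal{H}_{\mathbf{x},z} f(\mathbf{x}, z). \tag{28}$$

Finally, for the term in $\mathcal{H}_{\mathbf{x}} f$, the integrand can be decomposed as follows:

$$\boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi} = h^4 r^2\, \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x} + h^2 r (2 - h^2 r) \sum_{i,j=1}^{q} \xi_i \xi_j\, \mathbf{b}_i^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{b}_j - 2 h^3 r^{\frac{3}{2}} (2 - h^2 r)^{\frac{1}{2}} \sum_{i=1}^{q} \xi_i\, \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{b}_i.$$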



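In virtue of Lemma 3, the third addend of this decomposition vanishes after integration, as does the second except for its diagonal terms. Next, as $\{\mathbf{x}, \mathbf{b}_1, \ldots, \mathbf{b}_q\}$ is an orthonormal basis in $\mathbb{R}^{q+1}$, the sum of the diagonal terms can be computed by simple algebra:

$$\sum_{i=1}^{q} \mathbf{b}_i^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{b}_i = \mathrm{tr}\left[\mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \sum_{i=1}^{q} \mathbf{b}_i \mathbf{b}_i^T\right] = \mathrm{tr}\left[\mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \left(\mathbf{I}_{q+1} - \mathbf{x}\mathbf{x}^T\right)\right] = \nabla_{\mathbf{x}}^2 f(\mathbf{x}, z) - \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x},$$

where $\nabla_{\mathbf{x}}^2 f(\mathbf{x}, z)$ is the Laplacian of $f$ restricted to the directional component $\mathbf{x}$, $\mathbf{I}_{q+1}$ denotes the identity matrix of order $q + 1$ and $\mathrm{tr}$ is the trace operator. By Lemma 3 and the previous computation, the term in $\mathcal{H}_{\mathbf{x}} f$ is

$$\int_{\Omega_{q-1}} \boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}\, \omega_{q-1}(d\boldsymbol\xi) = \omega_{q-1} h^4 r^2\, \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x} + \omega_{q-1} h^2 r (2 - h^2 r)\, q^{-1} \left[\nabla_{\mathbf{x}}^2 f(\mathbf{x}, z) - \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x}\right]. \tag{29}$$

Note also that the order of the remainder is easily computed, since $\boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}^T \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi} = 2 h^2 r + g^2 v^2$:

$$o\left(\boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}^T \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right) = r\, o\left(h^2\right) + v^2 o\left(g^2\right). \tag{30}$$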
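Combining (25)-(30), and using condition DL2 on the kernel $K$:

$$(24) = c_{h,q}(L) h^q \int_0^{2h^{-2}} L(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \int_{\mathbb{R}} K(v) \Big\{ \int_{\Omega_{q-1}} \Big( \left[\boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z) - gv \nabla_z f(\mathbf{x}, z)\right] + \frac{1}{2} \left[\boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi} - 2gv \boldsymbol\gamma_{\mathbf{x},\boldsymbol\xi}^T \mathcal{H}_{\mathbf{x},z} f(\mathbf{x}, z) + g^2 v^2 \mathcal{H}_z f(\mathbf{x}, z)\right] + r\, o\left(h^2\right) + v^2 o\left(g^2\right) \Big)\, \omega_{q-1}(d\boldsymbol\xi) \Big\}\, dv\, dr$$

$$= \omega_{q-1} c_{h,q}(L) h^q \int_0^{2h^{-2}} L(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \int_{\mathbb{R}} K(v) \Big\{ -h^2 r\, \mathbf{x}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z) - gv \nabla_z f(\mathbf{x}, z) + \frac{1}{2} \left[h^4 r^2\, \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x} + h^2 r (2 - h^2 r)\, q^{-1} \left(\nabla_{\mathbf{x}}^2 f(\mathbf{x}, z) - \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x}\right)\right] + gv h^2 r\, \mathbf{x}^T \mathcal{H}_{\mathbf{x},z} f(\mathbf{x}, z) + \frac{g^2 v^2}{2} \mathcal{H}_z f(\mathbf{x}, z) + r\, o\left(h^2\right) + v^2 o\left(g^2\right) \Big\}\, dv\, dr$$

$$= \omega_{q-1} c_{h,q}(L) h^q \int_0^{2h^{-2}} L(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \Big\{ -h^2 r\, \mathbf{x}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z) + \frac{1}{2} \left[h^4 r^2\, \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x} + h^2 r (2 - h^2 r)\, q^{-1} \left(\nabla_{\mathbf{x}}^2 f(\mathbf{x}, z) - \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x}\right)\right] + \frac{g^2}{2} \mathcal{H}_z f(\mathbf{x}, z) \mu_2(K) + r\, o\left(h^2\right) + \mu_2(K)\, o\left(g^2\right) \Big\}\, dr. \tag{31}$$

For $h > 0$, $i = -1, 0, 1$, $j = 0, 1$, consider the following functions

$$\psi_{h,i,j}(r) = c_{h,q}(L) h^q L(r)\, r^{\frac{q}{2}+i} (2 - h^2 r)^{\frac{q}{2}-j}\, \mathbb{1}_{[0, 2h^{-2})}(r), \quad r \in [0, \infty).$$

When $n \to \infty$, $h \to 0$ and the limit of $\psi_{h,i,j}$ is given by

$$\psi_{i,j}(r) = \lim_{h \to 0} \psi_{h,i,j}(r) = \lambda_q(L)^{-1} L(r)\, r^{\frac{q}{2}+i}\, 2^{\frac{q}{2}-j}\, \mathbb{1}_{[0, \infty)}(r).$$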

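Applying Remark 5 of Lemma 1,

$$\lim_{h \to 0} \int_0^{\infty} \psi_{h,i,j}(r)\, dr = \lambda_q(L)^{-1} 2^{\frac{q}{2}-j} \int_0^{\infty} L(r)\, r^{\frac{q}{2}+i}\, dr \overset{(16)}{=} \begin{cases} \frac{2^{1-j}}{\omega_{q-1}}, & i = -1, \\ \frac{2^{1-j}}{\omega_{q-1}} b_q(L), & i = 0, \\ \frac{2^{1-j}}{\omega_{q-1}} \frac{\int_0^{\infty} L(r)\, r^{\frac{q}{2}+1}\, dr}{\int_0^{\infty} L(r)\, r^{\frac{q}{2}-1}\, dr}, & i = 1. \end{cases}$$

Then, the six integrals in (31) can be written using $\int_0^{\infty} \psi_{h,i,j}(r)\, dr = \int_0^{\infty} \psi_{i,j}(r)\, dr \cdot (1 + o(1))$. Replacing this in (31) leads to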



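$$(31) = -h^2 \omega_{q-1} \left[\frac{b_q(L)}{\omega_{q-1}} + o(1)\right] \mathbf{x}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z) + \frac{h^4 \omega_{q-1}}{2} \left[\frac{1}{\omega_{q-1}} \frac{\int_0^{\infty} L(r)\, r^{\frac{q}{2}+1}\, dr}{\int_0^{\infty} L(r)\, r^{\frac{q}{2}-1}\, dr} + o(1)\right] \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x}$$

$$+ \frac{h^2 \omega_{q-1}}{2} \left[\frac{2 b_q(L)}{\omega_{q-1}} + o(1)\right] q^{-1} \left(\nabla_{\mathbf{x}}^2 f(\mathbf{x}, z) - \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x}\right) + \frac{g^2 \omega_{q-1}}{2} \left[\frac{1}{\omega_{q-1}} + o(1)\right] \mathcal{H}_z f(\mathbf{x}, z) \mu_2(K)$$

$$+ \omega_{q-1} \left[\frac{b_q(L)}{\omega_{q-1}} + o(1)\right] o\left(h^2\right) + \omega_{q-1} \left[\frac{1}{\omega_{q-1}} + o(1)\right] \mu_2(K)\, o\left(g^2\right)$$

$$= h^2 b_q(L) \left[-\mathbf{x}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z) + q^{-1} \left(\nabla_{\mathbf{x}}^2 f(\mathbf{x}, z) - \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x}\right)\right] + \frac{g^2}{2} \mathcal{H}_z f(\mathbf{x}, z) \mu_2(K) + o\left(h^2\right) + o\left(g^2\right)$$

$$= h^2 b_q(L) \Psi_{\mathbf{x}}(f, \mathbf{x}, z) + \frac{g^2}{2} \mathcal{H}_z f(\mathbf{x}, z) \mu_2(K) + o\left(h^2\right) + o\left(g^2\right).$$

□

Proof of Proposition 3. The variance can be decomposed as

$$\mathrm{Var}\left[\hat f_{h,g}(\mathbf{x}, z)\right] = \frac{c_{h,q}(L)^2}{n g^2} \mathbb{E}\left[LK^2\left(\frac{1 - \mathbf{x}^T\mathbf{X}}{h^2}, \frac{z - Z}{g}\right)\right] - \frac{1}{n} \mathbb{E}\left[\hat f_{h,g}(\mathbf{x}, z)\right]^2, \tag{32}$$

where the computation of the first term is quite similar to the computation of the bias and the second is given in the previous result.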



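Analogous to (24),

$$\frac{c_{h,q}(L)^2}{n g^2} \mathbb{E}\left[LK^2\left(\frac{1 - \mathbf{x}^T\mathbf{X}}{h^2}, \frac{z - Z}{g}\right)\right] = \frac{c_{h,q}(L)^2 h^q}{n g} \int_0^{2h^{-2}} L^2(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \int_{\mathbb{R}} K^2(v) \int_{\Omega_{q-1}} f\left((\mathbf{x}, z) + \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right) \omega_{q-1}(d\boldsymbol\xi)\, dv\, dr, \tag{33}$$

just replacing $LK$ by $LK^2$. Then, using that $K^2$ is a symmetric function around zero,

$$\int_{\mathbb{R}} K^2(v)\, dv = R(K), \quad \int_{\mathbb{R}} v K^2(v)\, dv = 0, \quad \int_{\mathbb{R}} v^2 K^2(v)\, dv = \mu_2\left(K^2\right). \tag{34}$$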
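Applying the multivariate Taylor expansion of $f$ at $(\mathbf{x}, z)$ and (34), equation (33) results in

$$(33) = \omega_{q-1} \frac{c_{h,q}(L)^2 h^q}{n g} \int_0^{2h^{-2}} L^2(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \int_{\mathbb{R}} K^2(v) \Big\{ f(\mathbf{x}, z) - h^2 r\, \mathbf{x}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z) - gv \nabla_z f(\mathbf{x}, z) + \frac{1}{2} \left[h^4 r^2\, \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x} + h^2 r (2 - h^2 r)\, q^{-1} \left(\nabla_{\mathbf{x}}^2 f(\mathbf{x}, z) - \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x}\right)\right] + gv h^2 r\, \mathbf{x}^T \mathcal{H}_{\mathbf{x},z} f(\mathbf{x}, z) + \frac{g^2 v^2}{2} \mathcal{H}_z f(\mathbf{x}, z) + r\, o\left(h^2\right) + v^2 o\left(g^2\right) \Big\}\, dv\, dr$$

$$\overset{(34)}{=} \omega_{q-1} \frac{c_{h,q}(L)^2 h^q}{n g} \int_0^{2h^{-2}} L^2(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \Big\{ R(K) f(\mathbf{x}, z) - R(K) h^2 r\, \mathbf{x}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z) + \frac{R(K)}{2} \left[h^4 r^2\, \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x} + h^2 r (2 - h^2 r)\, q^{-1} \left(\nabla_{\mathbf{x}}^2 f(\mathbf{x}, z) - \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x}\right)\right] + \mu_2\left(K^2\right) \frac{g^2}{2} \mathcal{H}_z f(\mathbf{x}, z) + r\, o\left(h^2\right) + o\left(g^2\right) \Big\}\, dr. \tag{35}$$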
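Define the following functions, for $h > 0$, $i = -1, 0, 1$ and $j = 0, 1$:

$$\varphi_{h,i,j}(r) = c_{h,q}(L) h^q L^2(r)\, r^{\frac{q}{2}+i} (2 - h^2 r)^{\frac{q}{2}-j}\, \mathbb{1}_{[0, 2h^{-2})}(r), \quad r \in [0, \infty).$$

When $n \to \infty$, $h \to 0$ and the limit of $\varphi_{h,i,j}$ is given by

$$\varphi_{i,j}(r) = \lim_{h \to 0} \varphi_{h,i,j}(r) = \lambda_q(L)^{-1} L^2(r)\, r^{\frac{q}{2}+i}\, 2^{\frac{q}{2}-j}\, \mathbb{1}_{[0, \infty)}(r).$$

Applying the same techniques of the proof of Lemma 1 to the functions $\varphi_{h,i,j}$ with the different values of $i, j$ and $L^2$ instead of $L$, and using the relation (3), it follows:

$$\lim_{h \to 0} \int_0^{\infty} \varphi_{h,i,j}(r)\, dr = \lambda_q(L)^{-1} 2^{\frac{q}{2}-j} \int_0^{\infty} L^2(r)\, r^{\frac{q}{2}+i}\, dr = \begin{cases} \frac{2^{1-j}}{\omega_{q-1}} d_q(L), & i = -1, \\ \frac{2^{1-j}}{\omega_{q-1}} e_q(L), & i = 0, \\ \frac{2^{1-j}}{\omega_{q-1}} \frac{\int_0^{\infty} L^2(r)\, r^{\frac{q}{2}+1}\, dr}{\int_0^{\infty} L(r)\, r^{\frac{q}{2}-1}\, dr}, & i = 1, \end{cases}$$

where $e_q(L) = \int_0^{\infty} L^2(r)\, r^{\frac{q}{2}}\, dr \big/ \int_0^{\infty} L(r)\, r^{\frac{q}{2}-1}\, dr$. So, for the integrals in (35), $\int_0^{\infty} \varphi_{h,i,j}(r)\, dr = \int_0^{\infty} \varphi_{i,j}(r)\, dr \cdot (1 + o(1))$. Replacing this leads to: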



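$$(35) = \frac{c_{h,q}(L)}{n g} \Big\{ R(K) \omega_{q-1} \left[\frac{d_q(L)}{\omega_{q-1}} + o(1)\right] f(\mathbf{x}, z) - R(K) h^2 \omega_{q-1} \left[\frac{e_q(L)}{\omega_{q-1}} + o(1)\right] \mathbf{x}^T \nabla_{\mathbf{x}} f(\mathbf{x}, z)$$

$$+ \frac{R(K) h^4 \omega_{q-1}}{2} \left[\frac{1}{\omega_{q-1}} \frac{\int_0^{\infty} L^2(r)\, r^{\frac{q}{2}+1}\, dr}{\int_0^{\infty} L(r)\, r^{\frac{q}{2}-1}\, dr} + o(1)\right] \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x} + \frac{R(K) h^2 \omega_{q-1}}{2} \left[\frac{2 e_q(L)}{\omega_{q-1}} + o(1)\right] q^{-1} \left(\nabla_{\mathbf{x}}^2 f(\mathbf{x}, z) - \mathbf{x}^T \mathcal{H}_{\mathbf{x}} f(\mathbf{x}, z) \mathbf{x}\right)$$

$$+ \frac{\mu_2\left(K^2\right) g^2 \omega_{q-1}}{2} \left[\frac{d_q(L)}{\omega_{q-1}} + o(1)\right] \mathcal{H}_z f(\mathbf{x}, z) + \omega_{q-1} \left[\frac{e_q(L)}{\omega_{q-1}} + o(1)\right] o\left(h^2\right) + \omega_{q-1} \left[\frac{d_q(L)}{\omega_{q-1}} + o(1)\right] o\left(g^2\right) \Big\}$$

$$= \frac{c_{h,q}(L)}{n g} \Big\{ R(K) d_q(L) f(\mathbf{x}, z) + R(K) e_q(L) h^2 \Psi_{\mathbf{x}}(f, \mathbf{x}, z) + \mu_2\left(K^2\right) d_q(L) \frac{g^2}{2} \mathcal{H}_z f(\mathbf{x}, z) + o(1) + O\left(h^4\right) + o\left(h^2\right) + o\left(g^2\right) \Big\}$$

$$= \frac{c_{h,q}(L)}{n g} \Big\{ R(K) d_q(L) f(\mathbf{x}, z) + R(K) e_q(L) h^2 \Psi_{\mathbf{x}}(f, \mathbf{x}, z) + \mu_2\left(K^2\right) d_q(L) \frac{g^2}{2} \mathcal{H}_z f(\mathbf{x}, z) \Big\} + o\left((n h^q g)^{-1}\right). \tag{36}$$

The second term of (32) is

$$\mathbb{E}\left[\hat f_{h,g}(\mathbf{x}, z)\right]^2 = \left[f(\mathbf{x}, z) + b_q(L) h^2 \Psi_{\mathbf{x}}(f, \mathbf{x}, z) + \frac{g^2}{2} \mu_2(K) \mathcal{H}_z f(\mathbf{x}, z) + o\left(h^2\right) + o\left(g^2\right)\right]^2$$

$$= \left[f(\mathbf{x}, z) + b_q(L) h^2 \Psi_{\mathbf{x}}(f, \mathbf{x}, z) + \frac{g^2}{2} \mu_2(K) \mathcal{H}_z f(\mathbf{x}, z)\right]^2 + o\left(h^2\right) + o\left(g^2\right). \tag{37}$$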
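Joining (36) and (37),

$$\mathrm{Var}\left[\hat f_{h,g}(\mathbf{x}, z)\right] = \frac{c_{h,q}(L)}{n g} \Big[ R(K) d_q(L) f(\mathbf{x}, z) + R(K) e_q(L) h^2 \Psi_{\mathbf{x}}(f, \mathbf{x}, z) + \mu_2\left(K^2\right) d_q(L) \frac{g^2}{2} \mathcal{H}_z f(\mathbf{x}, z) \Big]$$

$$- \frac{1}{n} \Big[ f(\mathbf{x}, z) + b_q(L) h^2 \Psi_{\mathbf{x}}(f, \mathbf{x}, z) + \frac{g^2}{2} \mu_2(K) \mathcal{H}_z f(\mathbf{x}, z) \Big]^2 + o\left((n h^q g)^{-1}\right) + n^{-1} \left(o\left(h^2\right) + o\left(g^2\right)\right),$$

which can be simplified into

$$\mathrm{Var}\left[\hat f_{h,g}(\mathbf{x}, z)\right] = \frac{c_{h,q}(L)}{n g} R(K) d_q(L) f(\mathbf{x}, z) + o\left((n h^q g)^{-1}\right).$$

□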



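Proof of Theorem 1. Let $\{(\mathbf{X}_i, Z_i)\}_{i=1}^{n}$ be a random sample from the directional-linear random variable $(\mathbf{X}, Z)$, whose support is contained in $\Omega_q \times \mathbb{R}$. The directional-linear kernel estimator in a fixed point $(\mathbf{x}, z) \in \Omega_q \times \mathbb{R}$ can be written as

$$\hat f_{h_n,g_n}(\mathbf{x}, z) = \frac{1}{n} \sum_{i=1}^{n} V_{n,i}, \qquad V_{n,i} = \frac{c_{h_n,q}(L)}{g_n} LK\left(\frac{1 - \mathbf{x}^T\mathbf{X}_i}{h_n^2}, \frac{z - Z_i}{g_n}\right),$$

where the notation $h_n$ and $g_n$ for the bandwidths stresses the dependence on the sample size $n$ given by condition DL3.

As $\{(\mathbf{X}_i, Z_i)\}_{i=1}^{n}$ is a collection of independent and identically distributed (iid) copies of $(\mathbf{X}, Z)$, then $\{V_{n,i}\}_{i=1}^{n}$ is also an iid collection of copies of the random variable $V_n = \frac{c_{h_n,q}(L)}{g_n} LK\left(\frac{1 - \mathbf{x}^T\mathbf{X}}{h_n^2}, \frac{z - Z}{g_n}\right)$. Then, Lyapunov's condition ensures that, if for some $\delta > 0$ the next condition holds

$$\lim_{n \to \infty} \frac{\mathbb{E}\left[|V_n - \mathbb{E}[V_n]|^{2+\delta}\right]}{n^{\frac{\delta}{2}} \mathrm{Var}[V_n]^{1+\frac{\delta}{2}}} = 0,$$

then the following Central Limit Theorem is valid

$$\frac{\sqrt{n} \left(\bar V_n - \mathbb{E}[V_n]\right)}{\sqrt{\mathrm{Var}[V_n]}} \xrightarrow{d} \mathcal{N}(0, 1),$$

where $\bar V_n = \frac{1}{n} \sum_{i=1}^{n} V_{n,i}$. This condition will be proved for $V_n = \frac{c_{h_n,q}(L)}{g_n} LK\left(\frac{1 - \mathbf{x}^T\mathbf{X}}{h_n^2}, \frac{z - Z}{g_n}\right)$.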

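First of all, the order of $\mathbb{E}\left[|V_n|^{2+\delta}\right]$ is:

$$\mathbb{E}\left[|V_n|^{2+\delta}\right] = \left(\frac{c_{h_n,q}(L)}{g_n}\right)^{2+\delta} \int_{\Omega_q} \int_{\mathbb{R}} LK^{2+\delta}\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h_n^2}, \frac{z - t}{g_n}\right) f(\mathbf{y}, t)\, dt\, \omega_q(d\mathbf{y})$$

$$= \left(\frac{c_{h_n,q}(L)}{g_n}\right)^{2+\delta} g_n h_n^q \int_0^{2h_n^{-2}} \int_{\Omega_{q-1}} \int_{\mathbb{R}} LK^{2+\delta}(r, v)\, f\left((\mathbf{x}, z) + \boldsymbol\alpha_{\mathbf{x},z,\boldsymbol\xi}\right) dv\, \omega_{q-1}(d\boldsymbol\xi)\, r^{\frac{q}{2}-1} (2 - h_n^2 r)^{\frac{q}{2}-1}\, dr$$

$$\sim \left(\frac{c_{h_n,q}(L)}{g_n}\right)^{2+\delta} g_n h_n^q\, 2^{\frac{q}{2}-1} \omega_{q-1} \int_0^{\infty} \int_{\mathbb{R}} LK^{2+\delta}(r, v)\, r^{\frac{q}{2}-1}\, dv\, dr \cdot \left[f(\mathbf{x}, z) + o(1)\right]$$

$$\sim \left(\lambda_q(L) h_n^q g_n\right)^{-(1+\delta)} \lambda_q(L)^{-1} 2^{\frac{q}{2}-1} \omega_{q-1} \int_0^{\infty} \int_{\mathbb{R}} LK^{2+\delta}(r, v)\, r^{\frac{q}{2}-1}\, dv\, dr \cdot f(\mathbf{x}, z) = O\left((h_n^q g_n)^{-(1+\delta)}\right).$$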
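On the other hand, by Proposition 3, the variance of $V_n$ has order:

$$\mathrm{Var}[V_n] = \frac{c_{h_n,q}(L)}{g_n} R(K) d_q(L) f(\mathbf{x}, z) + o\left((h_n^q g_n)^{-1}\right) \sim \frac{R(K) d_q(L) f(\mathbf{x}, z)}{\lambda_q(L)} \frac{1}{h_n^q g_n} = O\left((h_n^q g_n)^{-1}\right).$$

Using that $\mathbb{E}\left[|V_n - \mathbb{E}[V_n]|^{2+\delta}\right] = O\left(\mathbb{E}\left[|V_n|^{2+\delta}\right]\right)$ (see Remark 4) and by condition DL3, it follows that Lyapunov's condition is satisfied:

$$\frac{\mathbb{E}\left[|V_n - \mathbb{E}[V_n]|^{2+\delta}\right]}{n^{\frac{\delta}{2}} \mathrm{Var}[V_n]^{1+\frac{\delta}{2}}} = O\left(\frac{(h_n^q g_n)^{-(1+\delta)}}{n^{\frac{\delta}{2}} (h_n^q g_n)^{-\left(1+\frac{\delta}{2}\right)}}\right) = O\left((n h_n^q g_n)^{-\frac{\delta}{2}}\right) \longrightarrow 0$$

as $n \to \infty$. Therefore,

$$\frac{\hat f_{h_n,g_n}(\mathbf{x}, z) - \mathbb{E}\left[\hat f_{h_n,g_n}(\mathbf{x}, z)\right]}{\sqrt{\mathrm{Var}\left[\hat f_{h_n,g_n}(\mathbf{x}, z)\right]}} \xrightarrow{d} \mathcal{N}(0, 1),$$

pointwise for every $(\mathbf{x}, z) \in \Omega_q \times \mathbb{R}$ (note that $\sqrt{n}$ is included in the variance term). Plugging in the asymptotic expressions for the bias and the variance results in

$$\sqrt{n h_n^q g_n} \left(\hat f_{h_n,g_n}(\mathbf{x}, z) - f(\mathbf{x}, z) - \mathrm{ABias}\left[\hat f_{h_n,g_n}(\mathbf{x}, z)\right]\right) \xrightarrow{d} \mathcal{N}\left(0, R(K) d_q(L) f(\mathbf{x}, z)\right).$$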



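Remark 4. The proof of $\mathbb{E}\left[|V_n - \mathbb{E}[V_n]|^{2+\delta}\right] = O\left(\mathbb{E}\left[|V_n|^{2+\delta}\right]\right)$ is simple, for example using the Newton Binomial: for any $r \in \mathbb{R}$, $(x + y)^r = \sum_{k=0}^{\infty} \binom{r}{k} x^{r-k} y^k$, with $\binom{r}{k} = \frac{r(r-1) \cdots (r-k+1)}{k!}$.

As $2 + \delta > 1$, by the triangular inequality and the Newton Binomial:

$$\mathbb{E}\left[|V_n - \mathbb{E}[V_n]|^{2+\delta}\right] \leq \mathbb{E}\left[\left(|V_n| + |\mathbb{E}[V_n]|\right)^{2+\delta}\right] = \mathbb{E}\left[\sum_{k=0}^{\infty} \binom{2+\delta}{k} |V_n|^{2+\delta-k} |\mathbb{E}[V_n]|^k\right] = \sum_{k=0}^{\infty} \binom{2+\delta}{k} \mathbb{E}\left[|V_n|^{2+\delta-k}\right] |\mathbb{E}[V_n]|^k.$$

Now, as $\mathbb{E}[V_n] = f(\mathbf{x}, z) + O\left(h_n^2 + g_n^2\right) = O(1)$ by Proposition 2 and by condition DL3, the terms $|\mathbb{E}[V_n]|^k$ are asymptotically constant. Also, as $\mathbb{E}\left[|V_n|^r\right] \leq \mathbb{E}\left[|V_n|^s\right]^{\frac{r}{s}}$ for $0 < r < s$, it follows that

$$\mathbb{E}\left[|V_n - \mathbb{E}[V_n]|^{2+\delta}\right] \leq \sum_{k=0}^{\infty} \binom{2+\delta}{k} \mathbb{E}\left[|V_n|^{2+\delta-k}\right] O(1) = O\left(\mathbb{E}\left[|V_n|^{2+\delta}\right]\right).$$

□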



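Proof of Proposition 4. It is straightforward from Proposition 1 and Lemma 5. For a point $\mathbf{x}$ in $\Omega_q$:

$$\mathrm{MSE}\left[\hat f_h(\mathbf{x})\right] = \left(\mathbb{E}\left[\hat f_h(\mathbf{x})\right] - f(\mathbf{x})\right)^2 + \mathrm{Var}\left[\hat f_h(\mathbf{x})\right]$$

$$= \left[b_q(L) \Psi(f, \mathbf{x}) h^2 + o\left(h^2\right)\right]^2 + \frac{c_{h,q}(L)}{n} d_q(L) f(\mathbf{x}) + o\left((n h^q)^{-1}\right)$$

$$= b_q(L)^2 \Psi(f, \mathbf{x})^2 h^4 + \frac{c_{h,q}(L)}{n} d_q(L) f(\mathbf{x}) + o\left(h^4 + (n h^q)^{-1}\right).$$

Integrating over $\Omega_q$ in the previous equation,

$$\mathrm{MISE}\left[\hat f_h\right] = b_q(L)^2 \int_{\Omega_q} \Psi(f, \mathbf{x})^2\, \omega_q(d\mathbf{x})\, h^4 + \frac{c_{h,q}(L)}{n} d_q(L) + o\left(h^4 + (n h^q)^{-1}\right).$$

□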



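Proof of Corollary 1. To obtain the bandwidth that minimizes the AMISE, use that, by (3), $c_{h,q}(L) \sim \lambda_q(L)^{-1} h^{-q}$ in the previous equation and differentiate with respect to $h$:

$$\frac{d}{dh} \mathrm{AMISE}\left[\hat f_h\right] = 4 b_q(L)^2 R\left(\Psi(f, \cdot)\right) h^3 - q \lambda_q(L)^{-1} h^{-(q+1)} d_q(L) n^{-1} = 0.$$

The solution of this equation results in

$$h_{\mathrm{AMISE}} = \left[\frac{q\, d_q(L)}{4 b_q(L)^2 \lambda_q(L) R\left(\Psi(f, \cdot)\right) n}\right]^{\frac{1}{4+q}}.$$

□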
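
The closed form for $h_{\mathrm{AMISE}}$ can be evaluated directly once the curvature term $R(\Psi(f, \cdot))$ is known. The sketch below assumes the von Mises kernel, with the constants of Lemma 4, and takes $R(\Psi(f, \cdot))$ as a user-supplied number (`R_psi` is a hypothetical input, since it depends on the unknown density); it also checks numerically that the formula sits at the minimum of the AMISE.

```python
import math

def von_mises_kernel_constants(q):
    """Constants (b_q, d_q, lambda_q) of Lemma 4 for L(r) = exp(-r)."""
    return q / 2.0, 2.0 ** (-q / 2.0), (2.0 * math.pi) ** (q / 2.0)

def amise(h, n, q, R_psi):
    """AMISE with c_{h,q}(L) replaced by its asymptotic form 1 / (lambda_q h^q)."""
    b_q, d_q, lam_q = von_mises_kernel_constants(q)
    return b_q ** 2 * R_psi * h ** 4 + d_q / (lam_q * h ** q * n)

def h_amise(n, q, R_psi):
    """Bandwidth minimizing the AMISE (Corollary 1)."""
    b_q, d_q, lam_q = von_mises_kernel_constants(q)
    return (q * d_q / (4.0 * b_q ** 2 * lam_q * R_psi * n)) ** (1.0 / (4 + q))

# Hypothetical curvature value, used only for illustration.
n, q, R_psi = 100, 2, 1.5
h_opt = h_amise(n, q, R_psi)
print(h_opt, amise(h_opt, n, q, R_psi))
```

Since the bandwidth is of order $n^{-1/(4+q)}$, multiplying the sample size by $2^{4+q}$ halves $h_{\mathrm{AMISE}}$.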



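Proof of Proposition 5. It is straightforward from Propositions 2 and 3:

$$\mathrm{MSE}\left[\hat f_{h,g}(\mathbf{x}, z)\right] = \left(\mathbb{E}\left[\hat f_{h,g}(\mathbf{x}, z)\right] - f(\mathbf{x}, z)\right)^2 + \mathrm{Var}\left[\hat f_{h,g}(\mathbf{x}, z)\right]$$

$$= \left[h^2 b_q(L) \Psi_{\mathbf{x}}(f, \mathbf{x}, z) + \frac{g^2}{2} \mathcal{H}_z f(\mathbf{x}, z) \mu_2(K) + o\left(h^2\right) + o\left(g^2\right)\right]^2 + \frac{c_{h,q}(L)}{n g} R(K) d_q(L) f(\mathbf{x}, z) + o\left((n h^q g)^{-1}\right)$$

$$= h^4 b_q(L)^2 \Psi_{\mathbf{x}}(f, \mathbf{x}, z)^2 + \frac{g^4}{4} \mu_2(K)^2 \mathcal{H}_z f(\mathbf{x}, z)^2 + h^2 g^2 b_q(L) \mu_2(K) \mathcal{H}_z f(\mathbf{x}, z) \Psi_{\mathbf{x}}(f, \mathbf{x}, z) + \frac{c_{h,q}(L)}{n g} R(K) d_q(L) f(\mathbf{x}, z) + o\left(h^4 + g^4 + (n h^q g)^{-1}\right).$$

Integrating the previous equation and denoting $I[\phi(\cdot, \cdot)] = \int_{\Omega_q \times \mathbb{R}} \phi(\mathbf{x}, z)\, \omega_q(d\mathbf{x})\, dz$ for a function $\phi : \Omega_q \times \mathbb{R} \to \mathbb{R}$,

$$\mathrm{MISE}\left[\hat f_{h,g}\right] = h^4 b_q(L)^2 I\left[\Psi_{\mathbf{x}}(f, \cdot, \cdot)^2\right] + \frac{g^4}{4} \mu_2(K)^2 I\left[\mathcal{H}_z f(\cdot, \cdot)^2\right] + h^2 g^2 b_q(L) \mu_2(K) I\left[\Psi_{\mathbf{x}}(f, \cdot, \cdot) \mathcal{H}_z f(\cdot, \cdot)\right] + \frac{c_{h,q}(L)}{n g} R(K) d_q(L) + o\left(h^4 + g^4 + (n h^q g)^{-1}\right).$$

□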



Proof of Corollary 2. Suppose that $g = \beta h$ in the previous equation. Again, use that $c_{h,q}(L) \sim \lambda_q(L)^{-1} h^{-q}$ and differentiate with respect to $h$ to obtain



dh 



AMISE 



where 



It follows immediately that 



^AMISE 



{q + i)n 

4(ci + C2 + C3)_ 



5+q 



Given that R{hq{L)^^{f,-,-) + ^ii2{K)n,f {■,■)) =ci + C2 + C3, the desired expression is obtained. 
In the case where g = 1 it is possible to derive the form of /3 by solving J| AMISE fh,gi-, ■) 

and ^AMISE fh,g{-, •) = 0. For this case, /3 has the closed form 



\21 \ 4 



□ 

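Proof of Proposition 6. Consider the $r$-mixture of directional von Mises densities given in (11). Then:

$$\mathrm{MISE}\left[\hat f_h\right] = \mathbb{E}\left[\int_{\Omega_q} \left(\hat f_h(\mathbf{x}) - f_r(\mathbf{x})\right)^2 \omega_q(d\mathbf{x})\right] = \mathbb{E}\left[\int_{\Omega_q} \left(\hat f_h(\mathbf{x})^2 - 2 \hat f_h(\mathbf{x}) f_r(\mathbf{x}) + f_r(\mathbf{x})^2\right) \omega_q(d\mathbf{x})\right]$$

$$= \frac{c_{h,q}(L)^2}{n} \int_{\Omega_q} \int_{\Omega_q} L^2\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) f_r(\mathbf{y})\, \omega_q(d\mathbf{x})\, \omega_q(d\mathbf{y})$$

$$+ \frac{c_{h,q}(L)^2 (n - 1)}{n} \int_{\Omega_q} \int_{\Omega_q} \int_{\Omega_q} L\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) L\left(\frac{1 - \mathbf{x}^T\mathbf{z}}{h^2}\right) f_r(\mathbf{y}) f_r(\mathbf{z})\, \omega_q(d\mathbf{x})\, \omega_q(d\mathbf{y})\, \omega_q(d\mathbf{z})$$

$$- 2 c_{h,q}(L) \int_{\Omega_q} \int_{\Omega_q} L\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) f_r(\mathbf{x}) f_r(\mathbf{y})\, \omega_q(d\mathbf{x})\, \omega_q(d\mathbf{y}) + \int_{\Omega_q} f_r(\mathbf{x})^2\, \omega_q(d\mathbf{x})$$

$$= (38) + (39) - (40) + (41).$$

The four terms of the previous equation will be computed separately, using $c_{h,q}(L) = C_q(1/h^2) e^{1/h^2}$ and $\int_{\Omega_q} e^{\mathbf{a}^T\mathbf{x}}\, \omega_q(d\mathbf{x}) = C_q(\|\mathbf{a}\|)^{-1}$. The first one is

$$(38) = \frac{c_{h,q}(L)^2}{n} \sum_{j=1}^{r} p_j \int_{\Omega_q} \int_{\Omega_q} e^{-2(1 - \mathbf{x}^T\mathbf{y})/h^2}\, \omega_q(d\mathbf{x})\, C_q(\kappa_j) e^{\kappa_j \mathbf{y}^T\boldsymbol\mu_j}\, \omega_q(d\mathbf{y}) = \frac{c_{h,q}(L)^2 e^{-2/h^2}}{n\, C_q(2/h^2)} \sum_{j=1}^{r} p_j \int_{\Omega_q} C_q(\kappa_j) e^{\kappa_j \mathbf{y}^T\boldsymbol\mu_j}\, \omega_q(d\mathbf{y}) = \left(D_q(h)\, n\right)^{-1}.$$

The second one is:

$$(39) = \frac{c_{h,q}(L)^2 (n - 1)}{n} e^{-2/h^2} \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l C_q(\kappa_j) C_q(\kappa_l) \int_{\Omega_q} \int_{\Omega_q} \int_{\Omega_q} e^{\mathbf{x}^T\mathbf{y}/h^2} e^{\mathbf{x}^T\mathbf{z}/h^2} e^{\kappa_j \mathbf{y}^T\boldsymbol\mu_j} e^{\kappa_l \mathbf{z}^T\boldsymbol\mu_l}\, \omega_q(d\mathbf{x})\, \omega_q(d\mathbf{y})\, \omega_q(d\mathbf{z})$$

$$= (1 - n^{-1})\, C_q(1/h^2)^2 \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l C_q(\kappa_j) C_q(\kappa_l) \int_{\Omega_q} \left[\int_{\Omega_q} e^{(\mathbf{x}/h^2 + \kappa_j \boldsymbol\mu_j)^T \mathbf{y}}\, \omega_q(d\mathbf{y})\right] \left[\int_{\Omega_q} e^{(\mathbf{x}/h^2 + \kappa_l \boldsymbol\mu_l)^T \mathbf{z}}\, \omega_q(d\mathbf{z})\right] \omega_q(d\mathbf{x})$$

$$= (1 - n^{-1})\, C_q(1/h^2)^2 \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l \int_{\Omega_q} \frac{C_q(\kappa_j) C_q(\kappa_l)}{C_q\left(\|\mathbf{x}/h^2 + \kappa_j \boldsymbol\mu_j\|\right) C_q\left(\|\mathbf{x}/h^2 + \kappa_l \boldsymbol\mu_l\|\right)}\, \omega_q(d\mathbf{x}) = (1 - n^{-1})\, \mathbf{p}^T \Psi_2(h) \mathbf{p},$$

where $\Psi_2(h)_{r \times r}$ is the matrix with $ij$-th entry $C_q(1/h^2)^2 \int_{\Omega_q} \frac{C_q(\kappa_i) C_q(\kappa_j)}{C_q\left(\|\mathbf{x}/h^2 + \kappa_i \boldsymbol\mu_i\|\right) C_q\left(\|\mathbf{x}/h^2 + \kappa_j \boldsymbol\mu_j\|\right)}\, \omega_q(d\mathbf{x})$.

The third one results in:

$$(40) = 2 c_{h,q}(L) \int_{\Omega_q} \int_{\Omega_q} L\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) f_r(\mathbf{x}) f_r(\mathbf{y})\, \omega_q(d\mathbf{x})\, \omega_q(d\mathbf{y})$$

$$= 2 c_{h,q}(L) e^{-1/h^2} \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l C_q(\kappa_j) C_q(\kappa_l) \int_{\Omega_q} \int_{\Omega_q} e^{(\mathbf{y}/h^2 + \kappa_j \boldsymbol\mu_j)^T \mathbf{x}}\, \omega_q(d\mathbf{x})\, e^{\kappa_l \mathbf{y}^T\boldsymbol\mu_l}\, \omega_q(d\mathbf{y})$$

$$= 2\, C_q(1/h^2) \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l \int_{\Omega_q} \frac{C_q(\kappa_j) C_q(\kappa_l)}{C_q\left(\|\mathbf{y}/h^2 + \kappa_j \boldsymbol\mu_j\|\right)} e^{\kappa_l \mathbf{y}^T\boldsymbol\mu_l}\, \omega_q(d\mathbf{y}) = 2\, \mathbf{p}^T \Psi_1(h) \mathbf{p},$$

where the matrix $\Psi_1(h)_{r \times r}$ has $ij$-th entry $C_q(1/h^2) \int_{\Omega_q} \frac{C_q(\kappa_i) C_q(\kappa_j)}{C_q\left(\|\mathbf{x}/h^2 + \kappa_i \boldsymbol\mu_i\|\right)} e^{\kappa_j \mathbf{x}^T\boldsymbol\mu_j}\, \omega_q(d\mathbf{x})$. Finally, the fourth term is:

$$(41) = \int_{\Omega_q} \Big(\sum_{j=1}^{r} p_j f_{vM}(\mathbf{x}; \boldsymbol\mu_j, \kappa_j)\Big)^2 \omega_q(d\mathbf{x}) = \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l C_q(\kappa_j) C_q(\kappa_l) \int_{\Omega_q} e^{(\kappa_j \boldsymbol\mu_j + \kappa_l \boldsymbol\mu_l)^T \mathbf{x}}\, \omega_q(d\mathbf{x})$$

$$= \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l \frac{C_q(\kappa_j) C_q(\kappa_l)}{C_q\left(\|\kappa_j \boldsymbol\mu_j + \kappa_l \boldsymbol\mu_l\|\right)} = \mathbf{p}^T \Psi_0(h) \mathbf{p},$$

where $\Psi_0(h)_{r \times r}$ represents the matrix with $ij$-th entry equal to $\frac{C_q(\kappa_i) C_q(\kappa_j)}{C_q\left(\|\kappa_i \boldsymbol\mu_i + \kappa_j \boldsymbol\mu_j\|\right)}$. Note that if $\kappa_j \boldsymbol\mu_j + \kappa_l \boldsymbol\mu_l = \mathbf{0}$ then $\int_{\Omega_q} \omega_q(d\mathbf{x}) = \omega_q = C_q(0)^{-1}$, so the result is consistent in this situation.

□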



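Proof of Proposition 7. Consider the $r$-mixture of directional-linear independent von Mises and normals $f_r(\mathbf{x}, z) = \sum_{j=1}^{r} p_j f_{vM}(\mathbf{x}; \boldsymbol\mu_j, \kappa_j)\, \phi_{\sigma_j}(z - m_j)$, $\sum_{j=1}^{r} p_j = 1$. Hence:

$$\mathrm{MISE}\left[\hat f_{h,g}\right] = \mathbb{E}\left[\int_{\Omega_q \times \mathbb{R}} \left(\hat f_{h,g}(\mathbf{x}, z) - f_r(\mathbf{x}, z)\right)^2 \omega_q(d\mathbf{x})\, dz\right] = \mathbb{E}\left[\int_{\Omega_q \times \mathbb{R}} \left(\hat f_{h,g}(\mathbf{x}, z)^2 - 2 \hat f_{h,g}(\mathbf{x}, z) f_r(\mathbf{x}, z) + f_r(\mathbf{x}, z)^2\right) \omega_q(d\mathbf{x})\, dz\right]$$

$$= \frac{c_{h,q}(L)^2}{n g^2} \int_{\Omega_q \times \mathbb{R}} \int_{\Omega_q \times \mathbb{R}} LK^2\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}, \frac{z - t}{g}\right) f_r(\mathbf{y}, t)\, \omega_q(d\mathbf{x})\, dz\, \omega_q(d\mathbf{y})\, dt$$

$$+ \frac{c_{h,q}(L)^2 (n - 1)}{n g^2} \int_{\Omega_q \times \mathbb{R}} \int_{\Omega_q \times \mathbb{R}} \int_{\Omega_q \times \mathbb{R}} LK\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}, \frac{z - t}{g}\right) LK\left(\frac{1 - \mathbf{x}^T\mathbf{u}}{h^2}, \frac{z - s}{g}\right) f_r(\mathbf{y}, t) f_r(\mathbf{u}, s)\, \omega_q(d\mathbf{x})\, dz\, \omega_q(d\mathbf{y})\, dt\, \omega_q(d\mathbf{u})\, ds$$

$$- 2 \frac{c_{h,q}(L)}{g} \int_{\Omega_q \times \mathbb{R}} \int_{\Omega_q \times \mathbb{R}} LK\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}, \frac{z - t}{g}\right) f_r(\mathbf{x}, z) f_r(\mathbf{y}, t)\, \omega_q(d\mathbf{x})\, dz\, \omega_q(d\mathbf{y})\, dt + \int_{\Omega_q \times \mathbb{R}} f_r(\mathbf{x}, z)^2\, \omega_q(d\mathbf{x})\, dz. \tag{42}$$

As the directional-linear kernel is a product kernel and the components of the mixture are independent, the directional and linear parts can be easily disentangled:

$$(42) = n^{-1} \sum_{j=1}^{r} p_j \left[c_{h,q}(L)^2 \int_{\Omega_q} \int_{\Omega_q} L^2\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) f_{vM}(\mathbf{y}; \boldsymbol\mu_j, \kappa_j)\, \omega_q(d\mathbf{x})\, \omega_q(d\mathbf{y})\right] \left[\frac{1}{g^2} \int_{\mathbb{R}} \int_{\mathbb{R}} K^2\left(\frac{z - t}{g}\right) \phi_{\sigma_j}(t - m_j)\, dz\, dt\right]$$

$$+ (1 - n^{-1}) \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l \left[c_{h,q}(L)^2 \int_{\Omega_q} \int_{\Omega_q} \int_{\Omega_q} L\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) L\left(\frac{1 - \mathbf{x}^T\mathbf{u}}{h^2}\right) f_{vM}(\mathbf{y}; \boldsymbol\mu_j, \kappa_j) f_{vM}(\mathbf{u}; \boldsymbol\mu_l, \kappa_l)\, \omega_q(d\mathbf{x})\, \omega_q(d\mathbf{y})\, \omega_q(d\mathbf{u})\right] \left[\frac{1}{g^2} \int_{\mathbb{R}} \int_{\mathbb{R}} \int_{\mathbb{R}} K\left(\frac{z - t}{g}\right) K\left(\frac{z - s}{g}\right) \phi_{\sigma_j}(t - m_j) \phi_{\sigma_l}(s - m_l)\, dz\, dt\, ds\right]$$

$$- 2 \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l \left[c_{h,q}(L) \int_{\Omega_q} \int_{\Omega_q} L\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) f_{vM}(\mathbf{x}; \boldsymbol\mu_j, \kappa_j) f_{vM}(\mathbf{y}; \boldsymbol\mu_l, \kappa_l)\, \omega_q(d\mathbf{x})\, \omega_q(d\mathbf{y})\right] \left[\frac{1}{g} \int_{\mathbb{R}} \int_{\mathbb{R}} K\left(\frac{z - t}{g}\right) \phi_{\sigma_j}(z - m_j) \phi_{\sigma_l}(t - m_l)\, dz\, dt\right]$$

$$+ \sum_{j=1}^{r} \sum_{l=1}^{r} p_j p_l \int_{\mathbb{R}} \phi_{\sigma_j}(z - m_j) \phi_{\sigma_l}(z - m_l)\, dz \int_{\Omega_q} f_{vM}(\mathbf{x}; \boldsymbol\mu_j, \kappa_j) f_{vM}(\mathbf{x}; \boldsymbol\mu_l, \kappa_l)\, \omega_q(d\mathbf{x}). \tag{43}$$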



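The directional parts were calculated in the proof of Proposition 6 and the linear ones were studied in Marron and Wand (1992) (see also Wand and Jones (1995), page 26). The combination of these two results yields

$$(43) = \left(D_q(h)\, 2\pi^{\frac{1}{2}} n g\right)^{-1} + (1 - n^{-1})\, \mathbf{p}^T \left[\Psi_2(h) \circ \Omega_2(g)\right] \mathbf{p} - 2\, \mathbf{p}^T \left[\Psi_1(h) \circ \Omega_1(g)\right] \mathbf{p} + \mathbf{p}^T \left[\Psi_0(h) \circ \Omega_0(g)\right] \mathbf{p},$$

where the $r \times r$ matrices $\Omega_a(g)$ have the $ij$-th entry equal to $\phi_{\sigma_a}(m_i - m_j)$, $\sigma_a = \left(a g^2 + \sigma_i^2 + \sigma_j^2\right)^{\frac{1}{2}}$, for $a = 0, 1, 2$, and $\Psi_a(h)$ are the matrices of Proposition 6. The notation $\circ$ denotes the Hadamard product between matrices, i.e., if $(\mathbf{A})_{ij} = a_{ij}$ and $(\mathbf{B})_{ij} = b_{ij}$, then $(\mathbf{A} \circ \mathbf{B})_{ij} = a_{ij} b_{ij}$. □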

Proof of Corollary 3. In virtue of equation (7), if the kernel of the density estimator (2) is $L(r) = e^{-r}$, $r \geq 0$, then the kernel estimator is the $n$-mixture of von Mises densities with means $\mathbf{X}_i$, $i = 1, \ldots, n$, equal weights $1/n$ and common concentration $1/h_p^2$, where $h_p$ is the pilot bandwidth parameter. The result then follows by applying Proposition 6 to this mixture. □
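
The identity behind this proof can be checked numerically on the circle ($q = 1$). The following sketch assumes angles as the representation of circular data, so that $\mathbf{x}^T\mathbf{X}_i = \cos(\theta - \theta_i)$; the sample values, the function names and the series used for the Bessel function are illustrative assumptions.

```python
import math

def bessel_i0(x, max_terms=5000):
    """Modified Bessel function I_0(x) via its power series."""
    total, term = 1.0, 1.0
    for k in range(1, max_terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
        if term < 1e-17 * total:
            break
    return total

def kde_von_mises_kernel(theta, sample, h):
    """Estimator (2) for q = 1 with L(r) = exp(-r) and c_{h,1}(L) = C_1(1/h^2) exp(1/h^2)."""
    a = 1.0 / h ** 2
    c = math.exp(a) / (2.0 * math.pi * bessel_i0(a))
    return c / len(sample) * sum(math.exp(-(1.0 - math.cos(theta - t)) * a) for t in sample)

def von_mises_mixture(theta, sample, h):
    """Equal-weight mixture of von Mises densities with means at the sample and kappa = 1/h^2."""
    a = 1.0 / h ** 2
    norm = 2.0 * math.pi * bessel_i0(a)
    return sum(math.exp(a * math.cos(theta - t)) / norm for t in sample) / len(sample)

sample = [0.1, 0.5, 2.0, 4.0]  # hypothetical angles in radians
h = 0.3
for theta in (0.0, 1.0, 3.0):
    print(theta, kde_von_mises_kernel(theta, sample, h), von_mises_mixture(theta, sample, h))
```

Both functions return the same values, which is why the exact MISE of a von Mises mixture can be reused to obtain the bootstrap MISE without resampling.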
Proof of Corollary 4. It follows immediately from the previous proposition and corollary. □

C Proofs of the technical lemmas 

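Proof of Lemma 1. Consider the functions

$$\varphi_h(r) = L(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1}\, \mathbb{1}_{[0, 2h^{-2})}(r),$$

$$\varphi(r) = \lim_{h \to 0} \varphi_h(r) = L(r)\, r^{\frac{q}{2}-1}\, 2^{\frac{q}{2}-1}\, \mathbb{1}_{[0, \infty)}(r).$$

Then, proving $\lim_{h \to 0} \lambda_{h,q}(L) = \lambda_q(L)$ is equivalent to proving $\lim_{h \to 0} \int_0^{\infty} \varphi_h(r)\, dr = \int_0^{\infty} \varphi(r)\, dr$.

Consider first the case $q \geq 2$. As $\frac{q}{2} - 1 \geq 0$, then $(2 - h^2 r)^{\frac{q}{2}-1} \leq 2^{\frac{q}{2}-1}$, $\forall h > 0$, $\forall r \in [0, 2h^{-2})$. Then:

$$|\varphi_h(r)| \leq L(r)\, r^{\frac{q}{2}-1}\, 2^{\frac{q}{2}-1}\, \mathbb{1}_{[0, 2h^{-2})}(r) \leq \varphi(r), \quad \forall r \in [0, \infty),\ \forall h > 0.$$

Because $\int_0^{\infty} \varphi(r)\, dr < \infty$ by condition D2 on the kernel $L$, then by the DCT it follows that $\lim_{h \to 0} \int_0^{\infty} \varphi_h(r)\, dr = \int_0^{\infty} \varphi(r)\, dr$.

For the case $q = 1$, $\varphi_h(r) = L(r)\, r^{-\frac{1}{2}} (2 - h^2 r)^{-\frac{1}{2}}\, \mathbb{1}_{[0, 2h^{-2})}(r)$. Consider now the following decomposition:

$$\int_0^{\infty} \varphi_h(r)\, dr = \int_0^{\infty} L(r)\, r^{-\frac{1}{2}} (2 - h^2 r)^{-\frac{1}{2}}\, \mathbb{1}_{[0, h^{-2})}(r)\, dr + \int_0^{\infty} L(r)\, r^{-\frac{1}{2}} (2 - h^2 r)^{-\frac{1}{2}}\, \mathbb{1}_{[h^{-2}, 2h^{-2})}(r)\, dr.$$

The limit of the first integral can be derived analogously with the DCT. As $(2 - h^2 r)^{-\frac{1}{2}}$ is monotone increasing in $r$, then $(2 - h^2 r)^{-\frac{1}{2}} \leq 1$, $\forall r \in [0, h^{-2})$, $\forall h > 0$. Therefore:

$$\left|L(r)\, r^{-\frac{1}{2}} (2 - h^2 r)^{-\frac{1}{2}}\, \mathbb{1}_{[0, h^{-2})}(r)\right| \leq L(r)\, r^{-\frac{1}{2}}\, \mathbb{1}_{[0, h^{-2})}(r) \leq 2^{\frac{1}{2}} \varphi(r), \quad \forall r \in [0, \infty),\ \forall h > 0.$$

Then, as $\lim_{h \to 0} L(r)\, r^{-\frac{1}{2}} (2 - h^2 r)^{-\frac{1}{2}}\, \mathbb{1}_{[0, h^{-2})}(r) = \varphi(r)$ and $\int_0^{\infty} \varphi(r)\, dr < \infty$ by condition D2, the DCT guarantees that $\lim_{h \to 0} \int_0^{\infty} L(r)\, r^{-\frac{1}{2}} (2 - h^2 r)^{-\frac{1}{2}}\, \mathbb{1}_{[0, h^{-2})}(r)\, dr = \int_0^{\infty} \varphi(r)\, dr$.

For the second integral, as a consequence of D2 and Remark 2, $L$ must decrease faster than any power function. In particular, for some fixed $h_0 > 0$, $L(r) \leq r^{-1}$, $\forall r \in [h^{-2}, 2h^{-2})$, $\forall h \in (0, h_0)$. Using this results in:

$$\lim_{h \to 0} \int_{h^{-2}}^{2h^{-2}} L(r)\, r^{-\frac{1}{2}} (2 - h^2 r)^{-\frac{1}{2}}\, dr \leq \lim_{h \to 0} \int_{h^{-2}}^{2h^{-2}} r^{-\frac{3}{2}} (2 - h^2 r)^{-\frac{1}{2}}\, dr = \lim_{h \to 0} h = 0.$$

This completes the proof.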

Remark 5. It is possible to apply the same techniques to prove the result with the functions

$$\varphi_{h,i,j,k}(r) = L^k(r)\, r^{\frac{q}{2}+i} (2 - h^2 r)^{\frac{q}{2}-j}\, \mathbb{1}_{[0, 2h^{-2})}(r),$$

$$\varphi_{i,j,k}(r) = \lim_{h \to 0} \varphi_{h,i,j,k}(r) = L^k(r)\, r^{\frac{q}{2}+i}\, 2^{\frac{q}{2}-j}\, \mathbb{1}_{[0, \infty)}(r),$$

with $i = -1, 0, 1$, $j = 0, 1$ and $k = 1, 2$. For the cases where $\frac{q}{2} - j \geq 0$, use the DCT. For the other cases, subdivide the integral over $[0, 2h^{-2})$ into the intervals $[0, h^{-2})$ and $[h^{-2}, 2h^{-2})$. Then apply the DCT in the former and use a suitable power function to make the latter tend to zero in the same way as described previously.

□
Proof of Lemma 2. Following Blumenson (1960), if $\mathbf{x}$ is a vector of norm $r$ with components $x_j$, $j = 1, \ldots, n$, with respect to an orthonormal basis in $\mathbb{R}^n$, then the $n$-dimensional spherical coordinates of $\mathbf{x}$ are given by

$$\begin{aligned} x_1 &= r \cos \phi_1, \\ x_j &= r \cos \phi_j \prod_{k=1}^{j-1} \sin \phi_k, \quad j = 2, \ldots, n - 2, \\ x_{n-1} &= r \sin \theta \prod_{k=1}^{n-2} \sin \phi_k, \\ x_n &= r \cos \theta \prod_{k=1}^{n-2} \sin \phi_k, \\ J &= r^{n-1} \prod_{k=1}^{n-2} \sin^k \phi_{n-1-k}, \end{aligned} \tag{44}$$

where $0 \leq \phi_j \leq \pi$, $j = 1, \ldots, n - 2$, $0 \leq \theta < 2\pi$ and $0 \leq r < \infty$. $J$ denotes the Jacobian of the transformation. Special cases of this parametrization are the polar coordinates ($n = 2$),

$$x_1 = r \cos \theta, \quad x_2 = r \sin \theta, \quad J = r,$$

and the spherical coordinates ($n = 3$),

$$x_1 = r \cos \phi, \quad x_2 = r \sin \theta \sin \phi, \quad x_3 = r \cos \theta \sin \phi, \quad J = r^2 \sin \phi.$$

Note that sometimes this parametrization appears with the roles of $x_1$ and $x_2$ swapped.

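To continue with the previous notation, denote $q = n - 1$. Use the spherical coordinates ($r = 1$, as the integration is on $\Omega_{n-1}$) and then apply the variable change

$$t = \cos \phi_1, \quad d\phi_1 = -(1 - t^2)^{-\frac{1}{2}}\, dt. \tag{45}$$

This gives

$$\int_{\Omega_{n-1}} f(\mathbf{x})\, \omega_{n-1}(d\mathbf{x}) = \int_0^{2\pi} \int_0^{\pi} \overset{(n-2)}{\cdots} \int_0^{\pi} f\Big(\cos \phi_1, \cos \phi_2 \sin \phi_1, \ldots, \cos \theta \prod_{k=1}^{n-2} \sin \phi_k\Big) \prod_{k=1}^{n-2} \sin^k \phi_{n-1-k}\, d\phi_{n-2} \cdots d\phi_1\, d\theta$$

$$\overset{(45)}{=} \int_0^{2\pi} \int_{-1}^{1} \int_0^{\pi} \overset{(n-3)}{\cdots} \int_0^{\pi} f\Big(t, \cos \phi_2 (1 - t^2)^{\frac{1}{2}}, \ldots, \cos \theta \prod_{k=2}^{n-2} \sin \phi_k (1 - t^2)^{\frac{1}{2}}\Big) \prod_{k=1}^{n-3} \sin^k \phi_{n-1-k}\, (1 - t^2)^{\frac{n-2}{2}} (1 - t^2)^{-\frac{1}{2}}\, d\phi_{n-2} \cdots d\phi_2\, dt\, d\theta$$

$$= \int_{-1}^{1} \int_0^{2\pi} \int_0^{\pi} \overset{(n-3)}{\cdots} \int_0^{\pi} f\Big(t, \cos \phi_2 (1 - t^2)^{\frac{1}{2}}, \ldots, \cos \theta \prod_{k=2}^{n-2} \sin \phi_k (1 - t^2)^{\frac{1}{2}}\Big) \prod_{k=1}^{n-3} \sin^k \phi_{n-1-k}\, (1 - t^2)^{\frac{n-3}{2}}\, d\phi_{n-2} \cdots d\phi_2\, d\theta\, dt$$

$$= \int_{-1}^{1} \int_{\Omega_{n-2}} f\left(t, (1 - t^2)^{\frac{1}{2}} \xi_1, \ldots, (1 - t^2)^{\frac{1}{2}} \xi_{n-1}\right) (1 - t^2)^{\frac{n-3}{2}}\, \omega_{n-2}(d\boldsymbol\xi)\, dt$$

$$= \int_{-1}^{1} \int_{\Omega_{n-2}} f\left(t, (1 - t^2)^{\frac{1}{2}} \boldsymbol\xi\right) (1 - t^2)^{\frac{n-3}{2}}\, \omega_{n-2}(d\boldsymbol\xi)\, dt.$$

So, for the $q$-dimensional sphere $\Omega_q$, equation (17) follows. Note that, as the parametrization (44) is invariant to permutations of the coordinates, $t$ can be placed in any argument of the function. The rest of the arguments will keep the entries $(1 - t^2)^{\frac{1}{2}} \boldsymbol\xi$.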

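This expression can be improved using an adequate basis representation. From a fixed point $\mathbf{y} \in \Omega_q$, it is possible to complete an orthonormal basis of $\mathbb{R}^{q+1}$, say $\{\mathbf{y}, \mathbf{b}_1, \ldots, \mathbf{b}_q\}$. So an element $\mathbf{x} \in \Omega_q$ will be expressed as:

$$\mathbf{x} = \langle \mathbf{x}, \mathbf{y} \rangle \mathbf{y} + \sum_{i=1}^{q} \langle \mathbf{x}, \mathbf{b}_i \rangle \mathbf{b}_i = t \mathbf{y} + (1 - t^2)^{\frac{1}{2}} \boldsymbol\xi,$$

where $t = \langle \mathbf{x}, \mathbf{y} \rangle \in [-1, 1]$ and $\boldsymbol\xi \in T_{\mathbf{y}} = \{\boldsymbol\eta \in \Omega_q : \boldsymbol\eta \perp \mathbf{y}\}$. Related to the basis $\{\mathbf{y}, \mathbf{b}_1, \ldots, \mathbf{b}_q\}$, there are the orthogonal matrix $\mathbf{B} = (\mathbf{y}, \mathbf{b}_1, \ldots, \mathbf{b}_q)_{(q+1) \times (q+1)}$ and the semi-orthogonal matrix $\mathbf{B}_q = (\mathbf{b}_1, \ldots, \mathbf{b}_q)_{(q+1) \times q}$. Using the fact that $\mathbf{B}$ is an orthonormal matrix, it is possible to make the change $\mathbf{x} = \mathbf{B}\mathbf{z}$, with $\det \mathbf{B} = 1$ and $\mathbf{B}^{-1} \Omega_q = \mathbf{B}^T \Omega_q = \Omega_q$ (as $\mathbf{B}$ preserves distances). Then, the relation (18) holds:

$$\int_{\Omega_q} f(\mathbf{x})\, \omega_q(d\mathbf{x}) = \int_{\Omega_q} f(\mathbf{B}\mathbf{z}) \det \mathbf{B}\, \omega_q(d\mathbf{z}) = \int_{\Omega_q} f(\mathbf{B}\mathbf{z})\, \omega_q(d\mathbf{z}) = \int_{-1}^{1} \int_{\Omega_{q-1}} f\left(t \mathbf{y} + (1 - t^2)^{\frac{1}{2}} \mathbf{B}_q \boldsymbol\xi\right) (1 - t^2)^{\frac{q}{2}-1}\, \omega_{q-1}(d\boldsymbol\xi)\, dt.$$

□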

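Proof of Lemma 3. Without loss of generality, assume that, by the $q$-spherical coordinates (44), $x_i = \cos \phi_1$ and $x_j = \cos \phi_2 \sin \phi_1$. Using this, the computations are straightforward for the integrands $x_i$ and $x_i x_j$:

$$\int_{\Omega_q} x_i\, \omega_q(d\mathbf{x}) = \int_0^{2\pi} \int_0^{\pi} \cdots \int_0^{\pi} \cos \phi_1 \prod_{k=1}^{q-2} \sin^k \phi_{q-k}\, \sin^{q-1} \phi_1\, d\phi_{q-1} \cdots d\phi_1\, d\theta$$

$$= \int_0^{2\pi} \int_0^{\pi} \cdots \int_0^{\pi} \prod_{k=1}^{q-2} \sin^k \phi_{q-k}\, d\phi_{q-1} \cdots d\phi_2\, d\theta \cdot \int_0^{\pi} \cos \phi_1 \sin^{q-1} \phi_1\, d\phi_1 = \omega_{q-1} \cdot 0 = 0,$$

$$\int_{\Omega_q} x_i x_j\, \omega_q(d\mathbf{x}) = \int_0^{2\pi} \int_0^{\pi} \cdots \int_0^{\pi} \cos \phi_1 \cos \phi_2 \sin \phi_1 \prod_{k=1}^{q-3} \sin^k \phi_{q-k}\, \sin^{q-2} \phi_2\, \sin^{q-1} \phi_1\, d\phi_{q-1} \cdots d\phi_1\, d\theta$$

$$= \int_0^{2\pi} \int_0^{\pi} \cdots \int_0^{\pi} \prod_{k=1}^{q-3} \sin^k \phi_{q-k}\, d\phi_{q-1} \cdots d\phi_3\, d\theta \cdot \int_0^{\pi} \cos \phi_1 \sin^{q} \phi_1\, d\phi_1 \int_0^{\pi} \cos \phi_2 \sin^{q-2} \phi_2\, d\phi_2 = \omega_{q-2} \cdot 0 \cdot 0 = 0.$$

The integrand $x_i^2$ is even simpler, using the fact that the integration is over $\Omega_q$:

$$\int_{\Omega_q} x_i^2\, \omega_q(d\mathbf{x}) = \frac{1}{q+1} \sum_{j=1}^{q+1} \int_{\Omega_q} x_j^2\, \omega_q(d\mathbf{x}) = \frac{1}{q+1} \int_{\Omega_q} \sum_{j=1}^{q+1} x_j^2\, \omega_q(d\mathbf{x}) = \frac{\omega_q}{q+1}.$$

□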



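Proof of Lemma 4. For $a = 1, 2$, $p = 0, 1$ and $q \geq 1$, the properties of the Gamma function ensure that

$$\int_0^{\infty} L^a(r)\, r^{\frac{q}{2}-p}\, dr = \int_0^{\infty} e^{-ar}\, r^{\frac{q}{2}-p}\, dr = \frac{\Gamma\left(\frac{q}{2} - p + 1\right)}{a^{\frac{q}{2}-p+1}}.$$

Therefore, using $\omega_{q-1} = 2\pi^{\frac{q}{2}} / \Gamma\left(\frac{q}{2}\right)$:

$$\lambda_q(L) = 2^{\frac{q}{2}-1} \omega_{q-1} \Gamma\left(\frac{q}{2}\right) = (2\pi)^{\frac{q}{2}}, \quad b_q(L) = \frac{\Gamma\left(\frac{q}{2} + 1\right)}{\Gamma\left(\frac{q}{2}\right)} = \frac{q}{2}, \quad d_q(L) = \frac{\Gamma\left(\frac{q}{2}\right)}{2^{\frac{q}{2}} \Gamma\left(\frac{q}{2}\right)} = 2^{-\frac{q}{2}}.$$

The expression for $c_{h,q}(L)$ arises from the fact that $c_{h,q}(L) = C_q\left(1/h^2\right) e^{1/h^2}$.

□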



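Proof of Lemma 5. This proof is a reworking of the one given in Zhao and Wu (2001) and is included for the sake of completeness. Furthermore, many techniques used in this proof are also helpful for the proofs of other results in this paper.

Denote $\mathrm{Bias}\left[\hat f_h(\mathbf{x})\right] = \mathbb{E}\left[\hat f_h(\mathbf{x})\right] - f(\mathbf{x})$. To compute the bias, use Lemma 2 for the change of variables with the orthonormal and semi-orthonormal matrices $\mathbf{B} = (\mathbf{x}, \mathbf{b}_1, \ldots, \mathbf{b}_q)$ and $\mathbf{B}_q = (\mathbf{b}_1, \ldots, \mathbf{b}_q)$, and then apply the ordinary change of variables

$$r = \frac{1 - t}{h^2}, \quad dr = -h^{-2}\, dt. \tag{46}$$

This results in:

$$\mathrm{Bias}\left[\hat f_h(\mathbf{x})\right] = c_{h,q}(L)\, \mathbb{E}\left[L\left(\frac{1 - \mathbf{x}^T\mathbf{X}}{h^2}\right)\right] - f(\mathbf{x})$$

$$= c_{h,q}(L) \int_{\Omega_q} L\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) f(\mathbf{y})\, \omega_q(d\mathbf{y}) - c_{h,q}(L) \int_{\Omega_q} L\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) \omega_q(d\mathbf{y})\, f(\mathbf{x})$$

$$= c_{h,q}(L) \int_{\Omega_q} L\left(\frac{1 - \mathbf{x}^T\mathbf{y}}{h^2}\right) \left(f(\mathbf{y}) - f(\mathbf{x})\right) \omega_q(d\mathbf{y})$$

$$= c_{h,q}(L) \int_{-1}^{1} \int_{\Omega_{q-1}} L\left(\frac{1 - t}{h^2}\right) \left(f\left(t\mathbf{x} + (1 - t^2)^{\frac{1}{2}} \mathbf{B}_q \boldsymbol\xi\right) - f(\mathbf{x})\right) (1 - t^2)^{\frac{q}{2}-1}\, \omega_{q-1}(d\boldsymbol\xi)\, dt$$

$$\overset{(46)}{=} c_{h,q}(L) h^q \int_0^{2h^{-2}} \int_{\Omega_{q-1}} L(r) \left(f\left(\mathbf{x} + \boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi}\right) - f(\mathbf{x})\right) r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1}\, \omega_{q-1}(d\boldsymbol\xi)\, dr$$

$$= c_{h,q}(L) h^q \int_0^{2h^{-2}} L(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \int_{\Omega_{q-1}} \left(f\left(\mathbf{x} + \boldsymbol\alpha_{\mathbf{x},\boldsymbol\xi}\right) - f(\mathbf{x})\right) \omega_{q-1}(d\boldsymbol\xi)\, dr, \tag{47}$$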



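where $\boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}} = -rh^2\mathbf{x} + h\left[r(2-h^2r)\right]^{\frac{1}{2}} \mathbf{B}_{\mathbf{x}}\boldsymbol{\xi}$, so that $\mathbf{x} + \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}} \in \Omega_q$. By condition D1, the Taylor expansion of $f$ at $\mathbf{x}$ is

\[
f(\mathbf{x} + \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}}) - f(\mathbf{x}) = \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}}^T \nabla f(\mathbf{x}) + \frac{1}{2} \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}}^T \mathcal{H}f(\mathbf{x})\, \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}} + o\left( \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}}^T \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}} \right),
\]

which splits the computation of $\int_{\Omega_{q-1}} \big( f(\mathbf{x} + \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}}) - f(\mathbf{x}) \big)\, \omega_{q-1}(d\boldsymbol{\xi})$ into two parts. For the first, use that the integral of $\xi_i$, $i = 1, \ldots, q$, vanishes by Lemma 3:

\begin{align*}
\int_{\Omega_{q-1}} \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}}^T \nabla f(\mathbf{x})\, \omega_{q-1}(d\boldsymbol{\xi})
&= -rh^2 \int_{\Omega_{q-1}} \mathbf{x}^T \nabla f(\mathbf{x})\, \omega_{q-1}(d\boldsymbol{\xi})
 + h\left[r(2-h^2r)\right]^{\frac{1}{2}} \int_{\Omega_{q-1}} \boldsymbol{\xi}^T \mathbf{B}_{\mathbf{x}}^T \nabla f(\mathbf{x})\, \omega_{q-1}(d\boldsymbol{\xi}) \\
&= -rh^2 \omega_{q-1}\, \mathbf{x}^T \nabla f(\mathbf{x}). \tag{48}
\end{align*}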
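In the second, by the results of Lemma 3,

\begin{align*}
\int_{\Omega_{q-1}} \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}}^T \mathcal{H}f(\mathbf{x})\, \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}}\, \omega_{q-1}(d\boldsymbol{\xi})
={}& r^2h^4 \int_{\Omega_{q-1}} \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x}\, \omega_{q-1}(d\boldsymbol{\xi}) \\
&- 2rh^3 \left[r(2-h^2r)\right]^{\frac{1}{2}} \int_{\Omega_{q-1}} \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{B}_{\mathbf{x}} \boldsymbol{\xi}\, \omega_{q-1}(d\boldsymbol{\xi}) \\
&+ h^2 r (2-h^2r) \int_{\Omega_{q-1}} \boldsymbol{\xi}^T \mathbf{B}_{\mathbf{x}}^T \mathcal{H}f(\mathbf{x})\, \mathbf{B}_{\mathbf{x}} \boldsymbol{\xi}\, \omega_{q-1}(d\boldsymbol{\xi}) \\
={}& r^2h^4 \omega_{q-1}\, \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x}
 + h^2 r (2-h^2r) \int_{\Omega_{q-1}} \sum_{i,j=1}^{q} \mathbf{b}_i^T \mathcal{H}f(\mathbf{x})\, \mathbf{b}_j\, \xi_i \xi_j\, \omega_{q-1}(d\boldsymbol{\xi}) \\
={}& r^2h^4 \omega_{q-1}\, \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x}
 + h^2 r (2-h^2r) \sum_{i=1}^{q} \mathbf{b}_i^T \mathcal{H}f(\mathbf{x})\, \mathbf{b}_i \int_{\Omega_{q-1}} \xi_i^2\, \omega_{q-1}(d\boldsymbol{\xi}) \\
={}& r^2h^4 \omega_{q-1}\, \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x}
 + h^2 r (2-h^2r)\, \omega_{q-1}\, q^{-1} \left[ \nabla^2 f(\mathbf{x}) - \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x} \right]. \tag{49}
\end{align*}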

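The last step uses that, since $\sum_{i=1}^{q} \mathbf{b}_i \mathbf{b}_i^T + \mathbf{x}\mathbf{x}^T = \mathbf{B}\mathbf{B}^T = \mathbf{I}_{q+1}$,

\[
\sum_{i=1}^{q} \mathbf{b}_i^T \mathcal{H}f(\mathbf{x})\, \mathbf{b}_i
= \mathrm{tr}\left[ \mathcal{H}f(\mathbf{x}) \sum_{i=1}^{q} \mathbf{b}_i \mathbf{b}_i^T \right]
= \mathrm{tr}\left[ \mathcal{H}f(\mathbf{x}) \left( \mathbf{I}_{q+1} - \mathbf{x}\mathbf{x}^T \right) \right]
= \nabla^2 f(\mathbf{x}) - \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x},
\]

with $\mathbf{I}_{q+1}$ being the identity matrix of order $q + 1$ and $\mathrm{tr}$ the trace operator.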
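The trace identity can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes $q = 2$, an arbitrary symmetric matrix standing in for $\mathcal{H}f(\mathbf{x})$, and builds $\mathbf{b}_1, \ldots, \mathbf{b}_q$ by Gram-Schmidt. All function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


def complete_basis(x):
    """Return orthonormal vectors b_1, ..., b_q spanning the complement of the unit vector x.

    Modified Gram-Schmidt applied to the canonical basis; the one canonical
    vector that is (numerically) dependent on the others is skipped.
    """
    dim = len(x)
    basis = [list(x)]
    for k in range(dim):
        e = [1.0 if i == k else 0.0 for i in range(dim)]
        for b in basis:
            c = dot(e, b)
            e = [ei - c * bi for ei, bi in zip(e, b)]
        if math.sqrt(dot(e, e)) > 1e-8:
            basis.append(normalize(e))
    return basis[1:]


def quad_form(h_matrix, u, v):
    """Compute u^T H v."""
    return sum(u[i] * h_matrix[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


def trace_identity_sides(x, h_matrix):
    """Return (sum_i b_i^T H b_i, tr(H) - x^T H x)."""
    bs = complete_basis(x)
    lhs = sum(quad_form(h_matrix, b, b) for b in bs)
    rhs = sum(h_matrix[i][i] for i in range(len(x))) - quad_form(h_matrix, x, x)
    return lhs, rhs


if __name__ == "__main__":
    x = normalize([1.0, 2.0, 2.0])  # a point on the sphere, q = 2
    hess = [[2.0, 1.0, 0.0], [1.0, 3.0, -1.0], [0.0, -1.0, 4.0]]  # stand-in Hessian
    print(trace_identity_sides(x, hess))
```

The two returned values agree up to rounding for any unit vector `x` and any square matrix, since only $\sum_i \mathbf{b}_i \mathbf{b}_i^T = \mathbf{I}_{q+1} - \mathbf{x}\mathbf{x}^T$ is used.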

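Apart from this, the order of the remainder in the Taylor expansion is

\[
o\left( \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}}^T \boldsymbol{\alpha}_{\mathbf{x},\boldsymbol{\xi}} \right) = o\left( r^2h^4 + h^2 r (2 - h^2 r) \right) = o\left( r^2h^4 + 2h^2 r - h^4 r^2 \right) = r\, o\left( h^2 \right). \tag{50}
\]

Adding (48)–(50),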

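\begin{align*}
(47) ={}& \omega_{q-1}\, c_{h,q}(L)\, h^q \int_0^{2h^{-2}} L(r)\, r^{\frac{q}{2}-1} (2 - h^2 r)^{\frac{q}{2}-1} \bigg\{ -rh^2\, \mathbf{x}^T \nabla f(\mathbf{x}) + \frac{r^2h^4}{2}\, \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x} \\
& \qquad + \frac{h^2 r (2 - h^2 r)}{2q} \left( \nabla^2 f(\mathbf{x}) - \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x} \right) + r\, o\left( h^2 \right) \bigg\}\, dr \\
={}& -h^2 \omega_{q-1} \left[ \int_0^{2h^{-2}} c_{h,q}(L)\, h^q L(r)\, r^{\frac{q}{2}} (2 - h^2 r)^{\frac{q}{2}-1}\, dr \right] \mathbf{x}^T \nabla f(\mathbf{x}) \\
&+ \frac{h^4 \omega_{q-1}}{2} \left[ \int_0^{2h^{-2}} c_{h,q}(L)\, h^q L(r)\, r^{\frac{q}{2}+1} (2 - h^2 r)^{\frac{q}{2}-1}\, dr \right] \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x} \\
&+ \frac{h^2 \omega_{q-1}}{2} \left[ \int_0^{2h^{-2}} c_{h,q}(L)\, h^q L(r)\, r^{\frac{q}{2}} (2 - h^2 r)^{\frac{q}{2}}\, dr \right] q^{-1} \left( \nabla^2 f(\mathbf{x}) - \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x} \right) \\
&+ o\left( h^2 \right) \omega_{q-1} \left[ \int_0^{2h^{-2}} c_{h,q}(L)\, h^q L(r)\, r^{\frac{q}{2}} (2 - h^2 r)^{\frac{q}{2}-1}\, dr \right]. \tag{51}
\end{align*}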



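Consider the following functions for $h > 0$ and $i, j = 0, 1$:

\[
\varphi_{h,i,j}(r) = c_{h,q}(L)\, h^q L(r)\, r^{\frac{q}{2}+i} (2 - h^2 r)^{\frac{q}{2}-j}\, \mathbb{1}_{[0, 2h^{-2})}(r), \quad r \in [0, \infty).
\]

When $n \to \infty$, $h \to 0$ and the limit of $\varphi_{h,i,j}$ is given by

\[
\varphi_{i,j}(r) = \lim_{h \to 0} \varphi_{h,i,j}(r) = \lambda_q(L)^{-1} L(r)\, r^{\frac{q}{2}+i}\, 2^{\frac{q}{2}-j}\, \mathbb{1}_{[0, \infty)}(r).
\]

Then, by Remark 5 and Lemma 1:

\[
\lim_{h \to 0} \int_0^\infty \varphi_{h,i,j}(r)\, dr = \lambda_q(L)^{-1}\, 2^{\frac{q}{2}-j} \int_0^\infty L(r)\, r^{\frac{q}{2}+i}\, dr
\overset{(16)}{=}
\begin{cases}
\dfrac{2^{1-j}\, b_q(L)}{\omega_{q-1}}, & i = 0, \\[2ex]
\dfrac{2^{1-j}\, b_q(L) \int_0^\infty L(r)\, r^{\frac{q}{2}+1}\, dr}{\omega_{q-1} \int_0^\infty L(r)\, r^{\frac{q}{2}}\, dr}, & i = 1.
\end{cases}
\]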



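So, for the terms between square brackets in (51), $\int_0^\infty \varphi_{h,i,j}(r)\, dr = \int_0^\infty \varphi_{i,j}(r)\, dr \cdot (1 + o(1))$. Replacing this in (51) leads to

\begin{align*}
(51) ={}& -h^2 \omega_{q-1} \left[ \frac{b_q(L)}{\omega_{q-1}} + o(1) \right] \mathbf{x}^T \nabla f(\mathbf{x}) \\
&+ \frac{h^4 \omega_{q-1}}{2} \left[ \frac{b_q(L) \int_0^\infty L(r)\, r^{\frac{q}{2}+1}\, dr}{\omega_{q-1} \int_0^\infty L(r)\, r^{\frac{q}{2}}\, dr} + o(1) \right] \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x} \\
&+ \frac{h^2 \omega_{q-1}}{2} \left[ \frac{2\, b_q(L)}{\omega_{q-1}} + o(1) \right] q^{-1} \left( \nabla^2 f(\mathbf{x}) - \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x} \right) \\
&+ o\left( h^2 \right) \omega_{q-1} \left[ \frac{b_q(L)}{\omega_{q-1}} + o(1) \right] \\
={}& h^2 b_q(L) \left[ -\mathbf{x}^T \nabla f(\mathbf{x}) + q^{-1} \left( \nabla^2 f(\mathbf{x}) - \mathbf{x}^T \mathcal{H}f(\mathbf{x})\, \mathbf{x} \right) \right] + O\left( h^4 \right) + o\left( h^2 \right) \\
={}& h^2 b_q(L)\, \Psi(f, \mathbf{x}) + o\left( h^2 \right).
\end{align*}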



□ 
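The bias expansion can be illustrated numerically in the simplest case. The following sketch is not taken from the paper and rests on several assumptions: $q = 1$ (the circle, parametrised by an angle $\theta$), the kernel $L(r) = e^{-r}$ with $b_1(L) = 1/2$, a von Mises target density with mean $0$ and concentration `kappa`, and the reduction $\Psi(f, \mathbf{x}) = f''(\theta)$, which is assumed to hold when $f$ is extended radially off the circle. The constant $c_{h,q}(L)$ is replaced by a numerical normalisation, and the function name is hypothetical.

```python
import math


def kde_bias_on_circle(theta, h, kappa=1.0, n_grid=4000):
    """Return (exact_bias, asymptotic_bias) of the kernel estimator at angle theta, q = 1."""
    step = 2.0 * math.pi / n_grid
    grid = [-math.pi + k * step for k in range(n_grid)]

    # Target density: von Mises with mean 0, normalised numerically on the grid.
    f_norm = sum(math.exp(kappa * math.cos(phi)) for phi in grid) * step

    def f(phi):
        return math.exp(kappa * math.cos(phi)) / f_norm

    # Kernel weights L((1 - x^T y) / h^2) with L(r) = exp(-r) and x^T y = cos(theta - phi).
    weights = [math.exp(-(1.0 - math.cos(theta - phi)) / h ** 2) for phi in grid]
    c_h = 1.0 / (sum(weights) * step)  # numerical stand-in for c_{h,q}(L)

    # E[f_hat(x)] = c_h * integral of the kernel against f, via the periodic trapezoid rule.
    expectation = c_h * sum(w * f(phi) for w, phi in zip(weights, grid)) * step
    exact_bias = expectation - f(theta)

    # Assumed leading term: h^2 * b_1(L) * f''(theta), with b_1(L) = 1/2.
    f_second = f(theta) * (kappa ** 2 * math.sin(theta) ** 2 - kappa * math.cos(theta))
    asymptotic_bias = 0.5 * h ** 2 * f_second
    return exact_bias, asymptotic_bias


if __name__ == "__main__":
    for h in (0.2, 0.1, 0.05):
        exact, approx = kde_bias_on_circle(0.7, h)
        print(h, exact, approx)
```

Under these assumptions the two values should approach each other as $h$ decreases, in line with the $o(h^2)$ remainder of the lemma.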
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